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IMPORTANT INFORMATION ABOUT THIS DOCUMENT 

Following are changes to the text in this manual that were not available until press time. 
Please note these changes when using the information contained in this manual. 


Documentation Changes 


Location in Document 

Document Currently Reads 

Change to Document 

Page 5-34: 

PCMPGTB/PCMPGTW/PCMPGT 

D—Packed Compare for Greater 
Than 

Heading: Description 

Paragraph: Four 

The PCMPGTD instruction 
compares the signed words in the 
destination operand to the 
corresponding signed words in the 
source operand. 

The PCMPGTD instruction 
compares the signed 
doublewords in the destination 
operand to the corresponding 
signed doublewords in the source 
operand. 

Page 5-51: 

PSRLW/PSRLD/PSRLQ-Packed 
Shift Right Logical, 

Heading: Description 

Paragraph: Four 

The PSLLD instruction shifts each 
of the two doublewords in the 
destination register to the right by 
the number of bits specified in the 
count operand. 

The PSRLD instruction shifts each 
of the two doublewords in the 
destination register to the right by 
the number of bits specified in the 
count operand. 

Page 5-56: PSUBSB/PSUBSW/ 
Packed Subtract with Saturation 
Heading: Description 

Paragraph: Three 

The PSUBB instruction subtracts 
the signed bytes of the source 
operand from the signed bytes of 
the destination operand, and writes 
the results to the destination 
register. 

The PSUBSB instruction subtracts 
the signed bytes of the source 
operand from the signed bytes of 
the destination operand, and writes 
the results to the destination 
register. 

Page 5-56: PSUBSB/PSUBSW/ 
Packed Subtract with Saturation 
Heading: Description 

Paragraph: Four 

The PSUBW instruction subtracts 
the signed words of the source 
operand from the signed words of 
the destination operand and writes 
the results to the destination 
register. 

The PSUBSW instruction 
subtracts the signed words of the 
source operand from the signed 
words of the destination operand 
and writes the results to the 
destination register. 
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CHAPTER 1 

INTRODUCTION TO THE INTEL ARCHITECTURE 

MMX™ TECHNOLOGY 


1.1. ABOUT THE INTEL ARCHITECTURE MMX™ TECHNOLOGY 

The media extensions for the Intel Architecture (lA) were designed to enhance performance 
of advanced media and communication applications. The MMX™ technology provides a 
new level of performance to computer platforms by adding new instructions and defining 
new 64-bit data types, while preserving compatibility with software and operating systems 
developed for the Intel Architecture. 

The MMX technology introduces new general-purpose instructions. These instructions 
operate in parallel on multiple data elements packed into 64-bit quantities. They perform 
arithmetic and logical operations on the different data types. These instructions accelerate the 
performance of applications with compute-intensive algorithms that perform localized, 
recurring operations on small native data. This includes applications such as motion video, 
combined graphics with video, image processing, audio synthesis, speech synthesis and 
compression, telephony, video conferencing, 2D graphics, and 3D graphics 

The lA MMX instruction set has a simple and flexible software model with no new mode or 
operating-system visible state. The MMX instruction set is fully compatible with all Intel 
Architecture microprocessors. All existing software continues to run correctly, without 
modification, on microprocessors that incorporate the MMX technology, as well as in the 
presence of existing and new applications that incorporate this technology. 


1.1.1. Single Instruction, Multiple Data (SIMD) Technique 

The MMX technology uses the Single Instruction, Multiple Data (SIMD) technique. This 
technique speeds up software performance by processing multiple data elements in parallel, 
using a single instruction. The MMX technology supports parallel operations on byte, word, 
and doubleword data elements, and the new quadword (64-bit) integer data type. 
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1.1.2. Performance Improvement 

Modern media, communications, and graphics applications now include sophisticated 
algorithms that perform recurring operations on small data types. The MMX technology 
directly addresses the need of these applications. For example, most audio data is represented 
in 16-bit (word) quantities. The MMX instructions can operate on four of these words 
simultaneously with one instruction. Video and graphics information is commonly 
represented as palletized 8-bit (byte) quantities; one MMX instruction can operate on eight of 
these bytes simultaneously. 



1.2. ABOUT THIS MANUAL 

It is assumed that the reader is familiar with the Intel Architecture software model and 
Assembly language programming. 

This manual describes the lA MMX instruction set and introduces the architectural features, 
instruction set, data types, data formats, application programming model, and system 
programming model of the MMX technology. It also explains how to use the new 
instructions to significantly increase the performance of applications. 

In this context, architecture refers to the conceptual structure and functional behavior of 
MMX technology as seen by a programmer, but not the logical organization or performance 
aspects of the actual implementation. 

This manual is organized into five chapters, including this chapter (Chapter 1), and four 
appendices: 

Chapter 1—^Introduction to the Intel Architecture MMX^^ Technology 

Chapter 2—Intel Architecture MMX™ Technology Features: This chapter provides an 
overview of the lA MMX technology and its new features. 

Chapter 3—Application Programming Model: This chapter describes the software 
conventions and architecture of the lA MMX technology. It defines the steps for writing 
MMX code. 

Chapter 4—System Programming Model: This chapter discusses interfacing with the 
operating system and compatibility with Intel Architecture. 

Chapter 5—Intel Architecture MMX™ Instruction Set: This chapter details the 
instructions, mnemonics, and instruction notations. A full description including graphical 
representations of the new instructions is presented. 

Appendix A—^lA MMX™ Instruction Set Summary: This appendix summarizes the 
instructions by functional groups. 
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Appendix B—^lA MMX™ Instruction Formats and Encodings: This appendix lists the 
instruction formats and encodings. It also lists a detailed break-down of the instruction 
operations and the supported data types. 

Appendix C—Alphabetical list of lA MMX™ Instruction Set Mnemonics: This appendix 
summarizes operand types, encodings in hexadecimal, and the formats used. 

Appendix D—lA MMX™ Instruction Set Opcode Map: This appendix provides a detailed 
encoding table of opcode mappings. 



1.3. RELATED DOCUMENTATION 

Refer to the following documentation for more information related to Intel Architecture: 

Pentium® Processor Family Developer’s Manual, Volume 3: Architecture and Programming 
Manual. Intel Corporation, Order Number 240897. 

Pentium® Pro Processor Developer’s Manual, Volumes 2 and 3. Intel Corporation, Order 
Numbers 242691 and 242692. 

Intel Architecture MMX^'^ Technology Developers’ Manual - Intel Corporation, Order 
Number 243010. 


Refer to Intel’s corporate website for the latest information on related documentation: 
http://www.intel.com 
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CHAPTER 2 

INTEL ARCHITECTURE MMX™ TECHNOLOGY 

FEATURES 


This chapter provides a general overview of the architectural features of the Intel 
Architecture MMX™ technology. 


2.1. NEW FEATURES 

MMX technology provides the following new features, while maintaining backward 
compatibility with all existing Intel Architecture microprocessors, lA applications, and 
operating systems. 

• New data types 

• Eight MMX registers 

• Enhanced instruction set 

The performance of applications which use these new features of MMX technology can be 
enchanced. 


2.2. NEW DATA TYPES 


The principal data type of the lA MMX technology is the packed fixed-point integer. The 
decimal point of the fixed-point values is implicit and is left for the user to control for 
maximum flexibility. 

The lA MMX technology defines the following four new 64-bit data types (See Figure 2-1): 


Packed byte 
Packed word 
Packed doubleword 
Quadword 


Eight bytes packed into one 64-bit quantity 
Four words packed into one 64-bit quantity 
Two doublewords packed into one 64-bit quantity 
One 64-bit quantity 
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Packed bytes (8x8 bits) 

63 56 55 48 47 40 39 32 31 24 23 16 15 8 7 0 


Packed word (4x16 bits) 

63 48 47 32 31 16 15 0 


Packed doublewords (2x32 bits) 

63 32 31 0 


Quadword (64 bits) 

63 _ 0 


Figure 2-1. Packed Data Types 


2.3. MMX^M REGISTERS 

The lA MMX technology provides eight 64-bit, general-purpose registers. These registers are 
aliased on the floating-point registers. The operating system handles the MMX technology as 
it would handle floating-point. (See Section 4.3 for more details on register aliasing.) 

The MMX registers can hold packed 64-bit data types. The MMX instructions access the 
MMX registers directly using the register names MMO to MM7 (See Figure 2-2). 

MMX registers can be used to perform calculations on data. They cannot be used to address 
memory; addressing is accomplished by using the integer registers and standard lA 
addressing modes. 



2-2 
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2.4. EXTENDED INSTRUCTION SET 

The lA MMX instruction set supplies a rich set of instructions that operate on all data 
elements of a packed data type, in parallel. The MMX instructions can operate on either 
signed or unsigned data elements. 

The MMX instructions implement two new principles (discussed in section 2.4.2.): 

• Operations on packed data 

• Saturation arithmetic 


2.4.1. Packed Data 

The MMX instructions can operate on groups of eight bytes, four words, and two 
doublewords. These groups of 64 bits are referred to as packed data. The same 64 bits of data 
can be treated as any one of the packed data types. Data is cast by the type specified by the 
instruction. 

For example, the PADDB (Add Packed Bytes) instruction adds two groups of eight packed 
bytes. The PADDW (Add Packed Words) instruction, which adds packed words, could 
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operate on the same 64 bits as the PADDB instruction treating the 64 bits as four 16-bit 
words. 



2.4.2. Saturation Arithmetic and Wrap Around 

The MMX technology supports a new arithmetic capability known as saturating arithmetic. 
Saturation is best defined by contrasting it with wraparound mode. 

In wraparound mode, results that overflow or underflow are truncated and only the lower 
(least significant) bits of the result are returned. That is, the carry is ignored. 

In saturation mode, results of an operation that overflow or underflow are clipped (saturated) 
to a data-range limit for the data type (see Table 2-1). The result of an operation that exceeds 
the range of a data-type saturates to the maximum value of the range. A result that is less 
than the range of a data type saturates to the minimum value of the range. This is useful in 
many cases, such as color calculations. 

For example, when the result exceeds the data range limit for signed bytes, it is saturated to 
0x7F (Ox^ for unsigned bytes). If a value is less than the data range limit, it is saturated to 
0x80 for signed bytes (0x00 for unsigned bytes). 

Saturation provides a useful feature of avoiding wraparound artifacts. In the example of color 
calculations, saturation causes a color to remain pure black or pure white without allowing 
for an inversion. 


Table 2-1. Data Range Limits for Saturation 



Lower Limit 

Upper Limit 

Signed 

Hexadecimal 

Decimal 

Hexadecimal 

Decimal 

Byte 

80H 

-128 

7FH 

127 

Word 

8000H 

-32,768 

7FFFH 

32,767 

Unsigned 





Byte 

OOH 

0 

FFH 

255 

Word 

OOOOH 

0 

FFFFH 

65,535 


MMX instructions do not indicate overflow or underflow occurrence by generating 
exceptions or setting flags. 
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2.4.3. Instruction Group Overview 

This section provides an overview of the MMX instruction groups. See Chapter 5 for detailed 
information on the instructions, including information on encoding, operation, and 
exceptions. 

The fifty-seven new MMX instructions are grouped into these categories: 

• Arithmetic Instructions 

• Comparison Instructions 

• Conversion Instructions 

• Logical Instructions 

• Shift Instructions 

• Data Transfer Instructions 

• Empty MMX State (EMMS) Instruction 


2.4.3.1. ARITHMETIC INSTRUCTIONS 
Packed Addition and Subtraction 

The PADD (Packed Add) and PSUB (Packed Subtract) instructions add or subtract the signed 
or unsigned data elements of the source operand to or from the destination operand in wrap¬ 
around mode. These instructions support packed byte, packed word, and packed doubleword 
data types. 

The PADDS (Packed Add with Saturation) and PSUBS (Packed Subtract with Saturation) 
instructions add or subtract the signed data elements of the source operand to or from the 
signed data elements of the destination operand and saturate the result to the limits of the 
signed data-type range. These instructions support packed byte and packed word data types. 

The PADDUS (Packed Add Unsigned with Saturation) and PSUBUS (Packed Subtract 
Unsigned with Saturation) instructions add or subtract the unsigned data elements of the 
source operand to or from the unsigned data elements of the destination operand and saturate 
the result to the limits of the unsigned data-type range. These instructions support packed 
byte and packed word data types. 

Packed Multiplication 

Packed multiplication instructions perform four multiplications on pairs of signed 16-bit 
operands, producing 32-bit intermediate results. Users may choose the low-order or high- 
order parts of each 32-bit result. 
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The PMULHW (Packed Multiply High) and PMULLW (Packed Multiply Low) instructions 
multiply the signed words of the source and destination operands and write the high-order or 
low-order 16 bits of each of the results to the destination operand. 

Packed Multiply Add 

The PMADDWD (Packed Multiply and Add) instruction calculates the products of the signed 
words of the source and destination operands. The four intermediate 32-bit doubleword 
products are summed in pairs to produce two 32-bit doubleword results. 



2.4.3.2. COMPARISON INSTRUCTIONS 

The PCMPEQ (Packed Compare for Equal) and PCMPGT (Packed Compare for Greater 
Than) instructions compare the corresponding data elements in the source and destination 
operands for equality or value greater than, respectively. These instructions generate a mask 
of ones or zeros which are written to the destination operand. Logical operations can use the 
mask to select elements. This can be used to implement a packed conditional move operation 
without a branch or a set of branch instructions. No flags are set. 

These instructions support packed byte, packed word and packed doubleword data types. 


2.4.3.3. CONVERSION INSTRUCTIONS 
Pack and Unpack 

The Pack and Unpack instructions perform conversions between the packed data types. 

The PACKSS (Packed with Signed Saturation) instruction converts signed words into signed 
bytes or signed doublewords into signed words, in signed saturation mode. 

The PACKUS (Packed with Unsigned Saturation) instruction converts signed words into 
unsigned bytes, in unsigned saturation mode. 

The PUNPCKH (Unpack High Packed Data) and PUNPCKL (Unpack Low Packed Data) 
instructions convert bytes to words, words to doublewords, or doublewords to quadwords. 


2.4.3.4. LOGICAL INSTRUCTIONS 

The PAND (Bitwise Logical And), PANDN (Bitwise Logical And Not), POR (Bitwise 
Logical OR), and PXOR (Bitwise Logical Exclusive OR) instructions perform bitwise logical 
operations on 64-bit quantities. 



INTEL ARCHITECTURE MMX™ TECHNOLOGY FEATURES 


iny 

2.4.3.5. SHIFT INSTRUCTIONS 

The logical shift left, logical shift right and arithmetic shift right instructions shift each 
element by a specified number of bits. The logical left and right shifts also enable a 64-bit 
quantity (quadword) to be shifted as one block, assisting in data type conversions and 
alignment operations. 

The PSLL (Packed Shift Left Logical) and PSRL (Packed Shift Right Logical) instructions 
perform a logical left or right shift, and fill the empty high or low order bit positions with 
zeros. These instructions support packed word, packed doubleword, and quadword data 
types. 

The PSRA (Packed Shift Right Arithmetic) instruction performs an arithmetic right shift, 
copying the sign bit into empty bit positions on the upper end of the operand. This instruction 
supports packed word and packed doubleword data types. 


2.4.3.6. DATA TRANSFER INSTRUCTIONS 

The MOVD (Move 32 Bits) instruction transfers 32 bits of packed data from memory to 
MMX registers and visa versa, or from integer registers to MMX registers and visa versa. 

The MOVQ (Move 64 Bits) instruction transfers 64-bits of packed data from memory to 
MMX registers and vise versa, or transfers data between MMX registers. 


2.4.3.7. EMMS (EMPTY MMX™ STATE) INSTRUCTION 

The EMMS instruction empties the MMX state. This instruction must be used to clear the lA 
MMX state (empty the floating-point tag word) at the end of an MMX routine before calling 
other routines that can execute floating-point instructions. 


2.4.4. Instruction Operand 

All MMX instructions, except the EMMS instruction, reference and operate on two operands: 
the source and destination operands. The right operand is the source and the left operand is 
the destination. The destination operand may also be a second source operand for the 
operation. The instruction overwrites the destination operand with the result. 

For example, a two-operand instruction would be decoded as: 

DEST(left operand) 4- DEST (left operand) OP SRC (right operand) 

The source operand for all the MMX instructions (except the data transfer instructions), can 
reside either in memory or in an MMX register. The destination operand resides in an MMX 
register. 
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For data transfer instructions, the source and destination operands can also be an integer 
register (for the MOVD instruction) or memory location (for both the MOVD and MOVQ 
instructions). 



2.5. COMPATIBILITY 

The lA MMX state is aliased upon the lA floating-point state. No new state or mode is added 
to support the MMX technology. The same floating-point instructions that save and restore 
the floating-point state also handle the lA MMX state (for example, during context 
switching). 

MMX technology uses the same interface techniques between the floating-point architecture 
and the operating system (primarily for task switching purposes). For more detail, see 
Section 4.1. 
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CHAPTER 3 

APPLICATION PROGRAMMING MODEL 


This chapter describes the application programming environment as seen by compiler writers 
and assembly-language programmers. It also describes the architectural features which 
directly affect applications. 


3.1. DATA FORMATS 

3.1.1. Memory Data Formats 

The Intel Architecture MMX™ technology introduces new packed data types, each 64 bits 
long. The data elements can be: 

• eight packed, consecutive 8-bit bytes 

• four packed, consecutive 16-bit words 

• two packed, consecutive 32-bit doublewords 

The 64 bits are numbered 0 through 63. Bit 0 is the least significant bit (LSB), and bit 63 is 
the most significant bit (MSB). 

The low-order bits are the lower part of the data element and the high-order bits are the upper 
part of the data element. For example, a word contains 16 bits numbered 0 through 15, the 
byte containing bits 0-7 of the word is called the low byte, and the byte containing bits 8-15 
is called the high byte. 

Bytes in a multi-byte format have consecutive memory addresses. The ordering is always 
little endian. That is, the bytes with the lower addresses are less significant than the bytes 
with the higher addresses. 
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Figure 3-1. Eight Packed Bytes in Memory (at address 1000H) 


3.1.2. lA Register Data Formats 

Values in lA MMX registers have the same format as a 64-bit quantity in memory. MMX 
registers have two data access modes: 64-bit access mode and 32-bit access mode. 

The 64-bit access mode is used for 64-bit memory access, 64-bit transfer between MMX 
registers, all pack, logical and arithmetic instructions, and some unpack instructions. 

The 32-bit access mode is used for 32-bit memory access, 32-bit transfer between integer 
registers and MMX registers, and some unpack instructions. 


3.1.3. lA MIVIX^i^ Instructions and the Floating-Point Tag Word 

After each MMX instruction, the entire floating-point tag word is set to Valid (00s). The 
Empty MMX State (EMMS) instruction sets the entire floating-point tag word to Empty 
(11s). 

Section 4.3.2. describes the effects of floating-point and MMX instructions on the floating¬ 
point tag word. For details on floating-point tag word, refer to the Pentium® Processor 
Family Developer’s Manual, Volume 3, Section 6.2.1.4. 
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3.2. PREFIXES 

Table 3-1 details the effect of a prefix on lA MMX instructions. 

Table 3-1. lA MMX"^*^ Instruction Behavior with Prefixes Used by Application Programs 


Prefix Type 

The Effect of Prefix on lA MMX™ Instructions 

Address size (67H) 

Affects lA MMX instructions with a memory operand. 

Ignored by lA MMX instructions without a memory operand. 

Operand size (66H) 

Ignored. 

Segment override 

Affects lA MMX instructions with a memory operand. 

Ignored by lA MMX instructions without a memory operand. 

Repeat 

Ignored. 

Lock(FOH) 

Generates an invalid opcode exception. 


See the Pentium® Processor Family Developer's Manual, Volume 3, Section 3.4, for 
information related to prefixes. 


3.3. WRITING APPLICATIONS WITH lA IVIMX^^ CODE 


3.3.1. Detecting lA MMX^*^ Technology Existence Using the 
CPUID Instruction 

Use the CPUID instruction to determine whether the processor supports the lA MMX 
instruction set (refer to the Pentium® Processor Family Developer’s Manual, Volume 3, 
Chapter 25, for more detail on the CPUID instruction). When the lA MMX technology 
support is detected by the CPUID instruction, it is signaled by setting bit 23 (lA MMX 
technology bit) in the feature flags to 1. In general, two versions of the routine can be 
created: one with scalar instructions and one with MMX instructions. The application will 
call the appropriate routine depending on the results of the CPUID instruction. If MMX 
technology support is detected, then the MMX routine is called; if no support for the MMX 
technology exists, the application calls the scalar routine, 

NOTE 

The CPUID instruction will continue to report the existence of the lA MMX 
technology if the CRO.EM bit is set (which signifies that the CPU is 
configured to generate exception Int 7 that can be used to emulate floating 
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point instructions). In this case, executing an MMX instruction results in an 
invalid opcode exception. 

Example 3-1 illustrates how to use the CPUID instruction. This example does not represent 
the entire CPUID sequence, but shows the portion used for lA MMX technology detection. 



Example 3-1. Partial sequence of lA MMX™ technology detection by CPUID 


; identify existence of CPUID instruction 
; identify Intel processor 

mov EAX, 1 ; request for feature flags 

CPUID ; OFh, 0A2h CPUID instruction 

test EDX, OOSOOOOOh ; Is lA MMX technology bit (Bit 23 of EDX) in feature flags set? 

jnz MMX_Technology_Found 


3.3.2. The EMMS Instruction 

When integrating the MMX routine into an application running under an existing operating 
system (OS), programmers need to take special precautions, similar to those when writing 
floating-point (FP) code. 

When an MMX instruction executes, the floating-point tag word is marked valid (00s). 
Subsequent floating-point instructions that will be executed may produce unexpected results 
because the floating-point stack seems to contain valid data. TTie EMMS instruction marks 
the floating-point tag word as empty. Therefore, it is imperative to use the EMMS instruction 
at the end of every MMX routine. 

The EMMS instruction must be used in each of the following cases: 

• Application utilizing FP instructions calls an MMX technology library/DLL 

• Application utilizing MMX instructions calls a FP library/DLL 

• Switch between MMX code in a task/thread and other tasks/threads in cooperative 
operating systems. 

If the EMMS instruction is not used when trying to execute a floating-point instruction, the 
following may occur: 

• Depending on the exception mask bits of the floating-point control word, a floating¬ 
point exception event may be generated. 
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• A "soft exception" may occur. In this case floating-point code continues to execute, but 
generates incorrect results. This happens when the floating-point exceptions are masked 
and no visible exceptions occur. The internal exception handler (microcode, not user 
visible) loads a NaN (Not a Number) with an exponent of 11 ..1 IB onto the floating-point 
stack. The NaN is used for further calculations, yielding incorrect results. 

• A potential error may occur only if the operating system does NOT manage floating¬ 
point context across task switches. These operating systems are usually cooperative 
operating systems. It is imperative that the EMMS instruction execute at the end of all 
the MMX routines that may enable a task switch immediately after they end execution 
(explicit yield API or implicit yield API). 


3.3.3. Interfacing with lA Technology Procedures and 

Functions 

The MMX technology enables direct access to all the MMX registers. This means that all 
existing interface conventions that apply to the use of other general registers such as EAX, 
EBX will also apply to the MMX register usage. 

An efficient interface might pass parameters and return values via the pre-defined MMX 
registers, or a combination of memory locations (via the stack) and MMX registers. This 
interface would have to be written in assembly language since passing parameters through 
MMX registers is not currently supported by any existing C compilers. Do not use the EMMS 
instruction when the interface to the MMX code has been defined to retain values in the 
MMX register. 

If a high-level language, such as C, is used, the data types could be defined as a 64-bit 
structure with packed data types. 

When implementing usage of lA MMX instructions in high level languages other approaches 
can be taken, such as: 

• Passing MMX type parameters to a procedure by passing a pointer to a structure via the 
integer stack. 

• Returning a value from a function by returning the pointer to a structure. 


3.3.4. Writing Code with lA and Floating-Point 

Instructions 

The MMX technology aliases the MMX registers on the floating-point registers. The main 
reason for this is to enable MMX technology to be fully compatible and transparent to 
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existing software environments (operating systems and applications). This way operating 
systems will be able to include new applications and drivers that use the lA MMX 
technology. 

An application can contain both floating-point and MMX code. However, the user is 
discouraged from causing frequent transitions between MMX and floating-point instructions 
by mixing MMX code and floating-point code. 



3.3.4.1. RECOMMENDATIONS AND GUIDELINES 

Do not mix MMX code and floating-point code at the instruction level for the following 

reasons: 

• The TOS (top of stack) value of the floating-point status word is set to 0 after each MMX 
instruction. This means that the floating-point code loses its pointer to its floating-point 
registers if the code mixes MMX instructions within a floating-point routine. 

• An MMX instruction write to an MMX 64-bit register writes ones (1 Is) to the exponent 
part of the corresponding floating-point register. 

• Floating-point code that uses register contents that were generated by the MMX 
instructions may cause floating-point exceptions or incorrect results. These floating-point 
exceptions are related to undefined floating-point values and floating-point stack usage. 

• All MMX instructions (except EMMS) set the entire tag word to the valid state (00s in 
all tag fields) without preserving the previous floating-point state. 

• Frequent transitions between the MMX and floating-point instructions may result in 
significant performance degradation in some implementations. 

If the application contains floating-point and MMX instructions, follow these guidelines: 

• Partition the MMX technology module and the floating-point module into separate 
instruction streams (separate loops or subroutines) so that they contain only instructions 
of one type. 

• Do not rely on register contents across transitions. 

• When the MMX state is not required, empty the MMX state using the EMMS instruction. 

• Exit the floating-point code section with an empty stack. 
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Example 3-2. Floating-point and MMX™ Code 


FP_code: 


MMX_code: 

(‘leave the FP stack empty*) 

EMMS 

FP_code 1: 

(‘mark the FP tag word as empty*) 


(‘leave the FP stack empty*) 


3.3.5. Multitasking Operating System Environment 

An application needs to identify the nature of the multitasking operating system on which it 
runs. Each task retains its own state which must be saved when a task switch occurs. The 
processor state (context) consists of the integer registers and floating-point and MMX 
registers. 

Operating systems can be classified into two types: 

• Cooperative multitasking operating system 

• Preemptive multitasking operating system 

The behavior of the two operating system types in context switching is described in 
Section 4.1.1. 


3.3.5.1. COOPERATIVE MULTITASKING OPERATING SYSTEM 

Cooperative multitasking operating systems do not save the FP or MMX state when 
performing a context switch. Therefore, the application needs to save the relevant state 
before relinquishing direct or indirect control to the operating system. 


3.3.5.2. PREEMPTIVE MULTITASKING OPERATING SYSTEM 

Preemptive multitasking operating systems are responsible for saving and restoring the FP 
and MMX state when performing a context switch. Therefore, the application does not have 
to save or restore the FP and MMX state. 
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3.3.6. Exception Handling in lA Application Code 

MMX instructions generate the same type of memory-access exceptions as other Intel 
Architecture instructions. Some examples are: page fault, segment not present, and limit 
violations. Existing exception handlers can handle these types of exceptions. They do not 
have to be modified. 

Unless there is a pending floating-point exception, MMX instructions do not generate 
numeric exceptions. Therefore, there is no need to modify existing exception handlers or add 
new ones. 

If a floating-point exception is pending, the subsequent MMX instruction generates a 
numeric error exception (Int 16 and/or FERR#). The MMX instruction resumes execution 
upon return from the exception handler. 



3.3.7. Register Mapping 

The lA MMX registers and their tags are mapped to physical locations of the floating-point 
registers and their tags. Register aliasing and mapping is described in more detail in 
Section 4.3.1. 
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This chapter presents the interface of the Intel Architecture MMX™ technology to the 
operating system. 


4.1. CONTEXT SWITCHING 

This section describes the behavior of operating systems during context switching. 

Different operating systems take different approaches for state-saving: 

• Some operating systems save the entire floating-point state. 

• Some save the floating-point state only when it is required. 

• Some may save a partial floating-point state. 

The existing task switch code for lA implementations (including floating-point code) does 
not change for systems that include MMX code. 


4.1.1. Cooperative Multitasking Operating System 

In a cooperative operating system, application tasks can predetermine when it is about to be 
switched out. Tasks can prepare in advance for the switch. 

Application programmers must know whether the operating system performs a state save or 
whether it is their responsibility to perform a state save. 


4.1.2. Preemptive Muititasking Operating System 

In a preemptive multitasking operating system, the application cannot know when it is 
preempted. Applications cannot prepare in advance for task switching. The operating system 
is responsible for saving and restoring the state when necessary. 

The lA MMX technology was defined to support the same state-saving and restoring 
techniques as the floating-point state-saving and restoring techniques. Existing operating 
systems can continue to run without modifications. 
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Figure 4-1 illustrates an example of an operating system implementing floating-point or 

MMX state saving. 

Detecting when to save the FP or MMX state needs to be saved is the same process used for 

detecting when the floating-point state needs to be saved. If CR0.TS=1 (task switch bit in 

control register 0), then the next FP or MMX instruction generates exception Int 7. 

1. The operating system maintains a save area for each task (Save Areas A and B in 
Figure 4-1). 

2. It defines a variable that indicates which task “owns” the FP or MMX state. 

3. On a task switch, the OS sets the CRO.TS to 1 if the incoming task does not own the FP 
or MMX state. Otherwise, it sets it to 0. 

4. If a new task attempts to use an MMX instruction, (while CR0.TS=1), exception Int 7 is 
generated. The Int 7 handler (“owned” by the operating system) saves the FP or MMX 
state to the save area of the FP or MMX state owner and restores the FP or MMX state 
from the save area of the current task. 

5. The ownership of the FP or MMX state then changes to the current task and CR0.TS=0. 



Task Switch Code INT 7 Handler 


Figure 4-1. Example of FP and MMX State Saving in Operating System 
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4.2. EXCEPTIONS 

MMX instructions do not generate numeric exceptions or affect the processor architecture 
status flags. Previously pending floating-point numeric errors are reported. 

The MMX instructions can generate the following exceptions: 

• Memory access exceptions: 

#SS Interrupt 12 - Stack exception 
#GP Interrupt 13 - General Protection 
#PF Interrupt 14 - Page Fault 

#AC Interrupt 17 - Alignment Check, if enabled by CPU configuration. 

• System exceptions: 

#UD Interrupt 6 - Invalid Opcode 

Executing an MMX instruction when CR0.EM=1 generates an Invalid Opcode exception. 
#NM Interrupt 7 - Device not available. The TS bit in CRO is set. 

• Pending floating-point error: 

#MF Interrupt 16 - Pending floating-point error 

• Other exceptions that occur indirectly due to faulty execution of the above exceptions. 
For example: Interrupt 12 occurs due to MMX instructions, and the interrupt gate directs 
the processor to invalid TSS (task state segment). 

The MMX instructions are accessible from all operation modes of lA: Protected mode, Real 
address mode, and Virtual 8086 mode. 
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4.3. COMPATIBILITY WITH EXISTING SOFTWARE 
ENVIRONMENTS 

4.3.1. Register Aliasing 

The MMX state is aliased on the floating-point state: 

• MMX registers MM0-MM7 are aliased on the 64-bit mantissas of the floating-point 
register (See Figure 4-2). 

• A value written to an MMX register using MMX instructions also appears in one of the 
eight floating-point registers (bits 63-0). The exponent field of the corresponding 
floating-point register (bits 78-64) and its sign bit (bit 79) are set to ones (11s). 

• The mantissa of a floating-point value written to a floating-point register by floating¬ 
point instructions also appears in an MMX register. 




SYSTEM PROGRAMMING MODEL 

















SYSTEM PROGRAMMING MODEL 


int^» 

MMX registers map to the physical locations of floating-point registers. MMX register 
mapping is fixed and does not change when the TOS (Top Of Stack field in the floating-point 
status word, bits 11-13) changes. 

The value of the TOS is set to 0 after each MMX instruction. 

In the floating-point context, STn refers to the relative location of a FP register, n, to the 
TOS. However, the FP tag bits refer to the physical locations of the FP register. The MMX 
registers always refer to the physical location. 

In Figure 4-3, the inner circle refers to the physical location of the FP and MMX registers. 
The outer circle refers to FP register's relative location to the current TOS. 

When the TOS=0 (case a in Figure 4-3), STO points to the physical location 0 on the floating¬ 
point stack. MMO maps to STO, MMl maps to STl, and so on. 


When the TOS=2 (case b in Figure 4-3), STO points to the physical location 2. MMO maps to 
ST6, MMl maps to ST7, MM2 maps to STO, and so on. 



Figure 4-3. Mapping of MMX™ Registers to Floating Point Registers 
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4.3.2. The Effect of Floating-Point and Instructions on 

the Floating-Point Tag Word 

Using an MMX instruction (except EMMS) validates (sets to 00s) the entire floating-point 
tag word. 

The EMMS instruction sets the entire FP tag bits register to empty (1 Is in each tag field). 

FSAVE and FSTENV instructions read the FP tag word and store the contents of the FP tag 
word in memory. Executing these instructions calculates the precise values of the FP tag 
word fields based on the current contents of the registers. After executing these instructions, 
all tag bit values are valid for MMX instructions: Valid, Zero, Special, Empty. The value of 
the FP tag word does not affect the MMX registers or execution of MMX instructions. 

Table 4-1 summarizes the effect of FP or MMX instructions and FSAVE/ FSTENV 
instructions on the tag bit fields in an FP or MMX register and defines their value in memory. 


Table 4-1. Effect of the FP and MMX Instructions on the FP Tag Word 


Instruction Type 

Instruction 

Tag Bits 

Calculated FP Tag 
Word in Memory After 
FSAVE/FSTENV 

MMXTM 

All (except EMMS) 

All registers' tags are 
set to zeros (00). 

00, 01,10 

MMX 

EMMS 

All registers' tags are 
set to ones (11). 

11 

FP 

All (except FRSTOR, 
FLDENV) 

Individual register tag is 
set to 00 or 11. 

Each register's tags are 
set to 00,11,01 or 10. 

FP 

FRSTOR, FLDENV 

All registers' tags are 
set to 00 or 11 or 01 or 

10. 

Each register’s tags are 
set to 00,11,01 or 10. 
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4.3.2.1. ALIASING SUMMARY 

Table 4-2 summarizes the effects of the MMX instructions on the floating-point state. 


Table 4-2. Effects of MMX™ Instruction on FP State 


Instruction 

Type 

FP Tag Word 

TOS 

(SW 13 .. 11 ) 

other FP 

Environment 
(CW, Data Ptr, 
Code Ptr, Other 
Fields) 

Exponent Bits + 
Signed Bit of 
MMn (79..64) 

Mantissa Part 
of MMn ( 63 .. 00 ) 

MMX register 
read from MMX 
register (MMn) 

Aii fieids set to 

00 (Vaiid) 

000 

Unchanged 

Unchanged 

Unchanged 

MMX register 
write to MMX 
register (MMn) 

All fields set to 

00 (Valid) 

000 

Unchanged 

Set to ones (11) 

Overwritten 

EMMS 

All fields set to 

11 (Empty) 

000 

Unchanged 

Unchanged 

Unchanged 


Note; MMn refers to one MMX register. 



4.3.3. Context Switch Support 

If the task switch bit (TS) in control register 0 (CRO) is set (CR0.TS=1), the first FP or MMX 
instruction that executes will trigger Int 7, Device not available (DNA). Causing a DNA fault 
enables an operating system to save the context of the FP or MMX registers with the same 
code currently used to save the FP state. Both the FSAVE (Store IT state) and FRSTOR 
(Restore FP state) instructions are used to save and restore either the FP or MMX state. 

See Section 4.1. for more details on context switching. 


4.3.4. Floating-Point Exceptions 

When floating-point exceptions are enabled and a FP exception is pending, subsequent MMX 
instruction execution reports an FP error (Int 16 and/or FERR# signal). The pending 
exception is handled by the FP exception handler. Execution resumes at the interrupted 
MMX instruction. 

Before the MMX instruction is executed, the FP state is maintained and is visible to the FP 
exception handler. 


See Section 3.3.6 for more detail. 
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4.3.5. Debugging 

The debug features for Intel Architecture implementations operate in the same manner on the 
MMX instruction set. This enables debuggers to debug code that uses the MMX technology. 


4.3.6. Emulation of the Instruction Set 

There is no emulation support for microprocessors that support the MMX technology. 

The CRO.EM bit used to emulate floating-point instructions cannot be used in the same way 
for MMX instruction emulation. If an MMX instruction executes when the CRO.EM bit is 
set, an invalid opcode exception (Int 6) is generated. 


4.3.7. Exception handling in Operating Systems 

This section specifies system exceptions. Exception handling in MMX code is discussed in 
Section 3.3.6. 

An invalid opcode exception (Int 6) can occur due to MMX instruction execution two cases: 

• On implementations that do not support lA MMX technology. 

• When CR0.EM=1 and an MMX instruction is executed. 

The CRO.EM bit is used to emulate the FP instructions in software. In this case, the operating 
system does not save the FP hardware state on task switches and does not save the MMX 
state. An invalid opcode exception is generated to flag this event to the operating system, 
and prevent application errors from occurring. 
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CHAPTER 5 

INTEL ARCHITECTURE MMX™ INSTRUCTION 

SET 


This chapter presents the Intel Architecture MMX™ instructions in alphabetical order, with a 
full description of each instruction. 

The lA MMX technology defines fifty-seven new instructions. The instructions are grouped 
into the following functional categories: 

• Arithmetic Instructions 

• Comparison Instructions 

• Conversion Instructions 

• Logical Instructions 

• Shift Instructions 

• Data Transfer Instructions 

• Empty MMX State (EMMS) Instruction 

Appendix A summarizes the MMX instructions grouped by categories of related functions. 
Appendix B provides instruction formats and encodings, and Appendix C provides an 
alphabetical list of instruction mnemonics, their source data types, encodings in hexadecimal, 
and format. Appendix D provides an Opcode Map of the MMX instructions. 

Many of the instructions have multiple variations depending on the data types they support. 
Each variation has a different suffix. For example the PADD instruction has three variations: 
PADDB, PADDW, and PADDD, where the letters B, W, and D represent byte, word, and 
doubleword. 


5.1. INSTRUCTION SYNTAX 

Instructions vary by: 

• Data type: packed bytes, packed words, packed doublewords or quadwords 

• Signed - Unsigned numbers 

• Wraparound - Saturate arithmetic 
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A typical MMX instruction has this syntax: 

• Prefix: P for Packed 

• Instruction operation: for example - ADD, CMP, or XOR 

• Suffix: 

— US for Unsigned Saturation 

— S for Signed saturation 

— B, W, D, Q for the data type: packed byte, packed word, packed doubleword, or 
quadword. 

Instructions that have different input and output data elements have two data-type suffixes. 
For example, the conversion instruction converts from one data type to another. It has two 
suffixes: one for the original data type and the second for the converted data type. 

This is an example of an instruction mnemonic syntax : 

PADDUSW (Packed Add Unsigned with Saturation for Word) 

P = Packed 

ADD = the instruction operation 

US = Unsigned Saturation 

W = Word 

5.2. INSTRUCTION FORMAT 

The lA MMX instructions use the existing lA instruction format. All instructions, except the 
EMMS instruction, use the ModR/M format. All are preceded by the OF prefix byte. For 
more details about the ModR/M format refer to Pentium® Processor Family Developer’s 
Manual Volume 3, Section 25.2.1. 

For data-transfer instructions, the destination and source operands can reside in memory, 
integer registers, or MMX registers. For all other lA MMX instructions, the destination 
operands reside in MMX registers, and the source operands reside in memory, MMX 
registers, or immediate operands. 

All existing address modes are supported using the SIB (Scale Index Base) format. 
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5.3. NOTATIONAL CONVENTIONS 

The following conventions apply to all MMX instructions (except the EMMS instruction): 

• The instructions reference and operate on two operands: the source and destination 
operands. The right operand is the source and the left operand is the destination. The 
destination operand may also supply one of the inputs for the operation. The instruction 
overwrites the destination operand with the result. 

• When one of the operands is a memory location, the linear address corresponds to the 
address of the least significant byte of the referenced memory data. 

• The MMX instructions do not affect the condition flags. 
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5.4. HOW TO READ THE INSTRUCTION SET PAGES 

The following is an example of the format used for each MMX instruction description in this 
chapter: 


PSLL—Packed Shift Left Logical 


Opcode 

Instruction 

Description 

OF F1 /r 

PSLLW mm, mm/m64 

Shift all words in MMX register to left by an amount 
specified in MMX register/memory, while shifting in 



zeros. 


The above table gives the instruction mnemonic and a brief description of the mnemonic. 
The columns content are explained below. 


Opcode Column 

The "Opcode" column provides the complete opcode produced for each form of the 

instruction. 

The codes are defined as hexadecimal bytes, in the same order in which they appear in 

memory. Definitions of entries other than hexadecimal bytes are as follows: 

• /digit: (digit is between 0 and 7) indicates that the ModR/M byte of the instruction 
uses only the r/m (register or memory) operand. The reg field contains the digit that 
provides a technology to the instruction’s opcode. 

• /r: indicates that the ModR/M byte of the instruction contains both a register 
operand and an r/m operand. 

• ib: a 1-byte, immediate operand to the instruction that follows the opcode, ModR/M 
bytes, and scale-indexing bytes. The opcode determines if the operand is a signed 
value. 

Instruction Column 

The "Instruction" column contains the instruction syntax. The following is a list of the 

symbols used to represent operands in the instruction statements: 

• immS: an immediate byte value, imm8 is a signed number between -128 and +127 
inclusive. 

• r/m32: a doubleword register or memory operand used for instructions whose 
operand-size attribute is 32 bits. 
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• min/m32: indicates the lowest 32 bits of an MMX register or a 32-bit memory 
location. 

• mm/m64: indicates a 64-bit MMX register or a 64-bit memory location. 

Description Column 

The "Description" column briefly explains the instruction activity. 

Operation 

The "Operation" section contains an algorithmic description of the operation performed by 
the instruction. 

The register name or memory location implies the contents of the register or memory. 

The bit values are written from high-order to low-order and indicate the address within the 
register or memory. The bit addresses are specified along with the register name or memory 
location in brackets. For example mm(7..0) represents the low-order 8 bits in an MMX 
register. 

The algorithms are composed of the following elements: 

• Comments are enclosed with the symbol pairs “(*” and 

• Compound statements are enclosed between the keywords of the “if’ statement (IF, 
THEN, ELSE). 

• A B; indicates that the value of B is assigned to A. 

• The symbols =, <>, >, <, >, and < are relational operators used to compare two values, 
meaning equal, not equal, greater or equal, less or equal, respectively. A relational 
expression such as A=B is TRUE if the value for A is equal to B; otherwise it is FALSE. 

The following functions are used in the algorithmic descriptions: 

• ZeroExtend (value) returns a value zero-extended to the operand-size attribute of the 
instruction. For example, if OperandSize = 32, ZeroExtend of a byte value of -10 
converts the byte from F6H to doubleword with hexadecimal value OOOOOOF6H. If the 
value passed to ZeroExtend and the operand-size attribute are the same size, ZeroExtend 
returns the value unaltered. 

• SignExtend (value) returns a value sign-extended to the operand-size attribute of the 
instruction. For example, if OperandSize = 32, SignExtend of a byte containing the 
value -10 converts the byte from F6H to doubleword with hexadecimal value 
FFFFFFF6H. If the value passed to SignExtend and the operand-size attribute are the 
same size, SignExtend returns the value unaltered. 
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• SaturateSignedWordToSignedByte converts a signed 16-bit value to a signed 8-bit 
value. If the signed 16-bit value is less than -128, it is represented by the saturated value 
-128 (0x80). If it is greater than 127, it is represented by the saturated value 127 (0x7F). 

• SaturateSignedDwordToSignedWord converts a signed 32-bit value to a signed 16-bit 
value. If the signed 32-bit value is less than -32768, it is represented by the saturated 
value -32768 (0x8000). If it is greater than 32767, it is represented by the saturated value 
32767 (0x7FFF). 

• SaturateSignedWordToUnsignedByte converts a signed 16-bit value to an unsigned 
8-bit value. If the signed 16-bit value is less than zero it is represented by the saturated 
value zero (OxOO).If it is greater than 255 it is represented by the saturated value 255 
(OxFF). 

• SaturateToSignedByte represents the result of an operation as a signed 8-bit value. If 
the result is less than -128, it is represented by the saturated value -128 (0x80). If it is 
greater than 127, it is represented by the saturated value 127 (0x7F). 

• SaturateToSignedWord represents the result of an operation as a signed 16-bit value. If 
the result is less than -32768, it is represented by the saturated value -32768 (0x8000).lf 
it is greater than 32767, it is represented by the saturated value 32767 (0x7FFF). 

• SaturateToUnsignedByte represents the result of an operation as a signed 8-bit value. If 
the result is less than zero it is represented by the saturated value zero (0x00). If it is 
greater than 255, it is represented by the saturated value 255 (OxFF). 

• SaturateToUnsignedWord represents the result of an operation as a signed 16-bit value. 
If the result is less than zero it is represented by the saturated value zero (OxOO).I If it is 
greater than 65535, it is represented by the saturated value 65535 (OxFFFF). 

Description 

The "Description" section describes the operation for all variations of the instruction. 

Example 

The “Example” section contains a graphical representation of the instruction's functional 

behavior. 

Exceptions 

The "Exceptions" section lists the exceptions in the three different modes: Protected mode. 

Real Address mode, and Virtual-8086 mode. 
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Refer to Section 4.2 of this document for more detail on these exceptions. See also the 
Pentium® Processor Family Manual, Volume 3, Section 9.4 and Chapter 14. 
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EMMS—Empty MMX^m state 



Instruction 

Description 


EMMS 

Set the FP taq word to emotv. 


Operation 

TW <- OxFFFF; 

Description 

The EMMS instruction sets the values of the floating-point (FP) tag word to empty (all ones). 
EMMS marks the registers as available, so they can subsequently be used by floating-point 
instructions. 

If a floating-point instruction loads into one of the registers before it has been reset by the 
EMMS instruction, a floating-point stack overflow can occur, which results in an FP 
exception or incorrect result. 

All other MMX instructions validate the entire FP tag word (all zeros). 

NOTE 

This instruction must be used to clear the MMX state at the end of all MMX 
routines, and before calling other routines that may execute floating-point 
instructions. 

Figure 5-1 shows the format of the FP Tag Word. 


15 0 


I 

FP tag(7) 

I 

FP tag(6) 

I 

FP tag(5) 

I 

FP tag(4) 

I 

FP tag(3) 

I 

FPtag(2) 

I 

FPtag(1) 

I 

FPtag(O) 


Tag values: 00 = Valid 10 = Valid 

01 = Valid 11 = Empty 

3006047 


Figure 5-1. Floating Point Tag Word Format 


Flags Affected 

None. 


I 


5-8 
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Protected Mode Exceptions 

#UD if CRO.EM = 1; #NM if TS bit in CRO is set; #MF if there is a pending FPU exception. 

Real Address Mode Exceptions 

#UD if CRO.EM = 1; #NM if TS bit in CRO is set; #MF if there is a pending FPU exception. 

Virtual 8086 Mode Exceptions 

#UD if CRO.EM = 1; #NM if TS bit in CRO is set; #MF if there is a pending FPU exception. 
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MOVD—Move 32 Bits 


Opcode 

Instruction 

Description 

0F6E/r 

MOVD mm, r/m32 

Move 32 bits from integer register/memory to MMX register. 

0F7E/r 

MOVD r/m32, mm 

Move 32 bits from MMX register to integer register/memorv. 


Operation 

IF destination = mm 
THEN 

mm(63..0) <- ZeroExtend(r/m32); 

ELSE 

r/m32 ^ mm(31..0): 

Description 

The MOVD instruction copies 32 bits from the source operand to the destination operand. 

The destination and source operands can be either MMX registers, 32-bit memory operands, 
or 32-bit integer registers. The MOVD cannot transfer data from an MMX register to an 
MMX register, from memory to memory, or from an integer register to an integer register. 

When the destination operand is an MMX register, the 32-bit source operand is written to the 
low-order 32 bits of the 64-bit destination register. The destination register is zero-extended 
to 64 bits. 

When the source operand is an MMX register, the low-order 32 bits of the MMX register are 
written to the 32-bit integer register or 32-bit memory location. 
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Example 



Flags Affected 

None. 

Protected Mode Exceptions 

#GP(0) if the destination is in a nonwritable segment; #GP(0) for an illegal memory operand 
effective address in the CS, DS, ES, FS, or GS segments; #SS(0) for an illegal address in the 
SS segment; #PF(fault-code) for a page fault; #AC for unaligned memory reference if the 
current privilege level is 3; #UD if CRO.EM = 1; #NM if TS bit in CRO is set; #MF if there is 
a pending FPU exception. 

Real Address Mode Exceptions 

Interrupt 13 if any part of the operand lies outside of the effective address space from 0 to 
OFFFFFH; #UD if CRO.EM = 1; #NM if TS bit in CRO is set; #MF if there is a pending FPU 
exception. 

Virtual 8086 Mode Exceptions 

Same exceptions as in Real Address Mode; #PF(fault-code) for a page fault; #AC for 
unaligned memory reference. 





INTEL ARCHITECTURE MMX™ INSTRUCTION SET 



MOVQ—Move 64 Bits 


Opcode 

Instruction 

Description 


0F6F/r 

MOVQ mm, mm/m64 

Move 64 bits from MMX register/memory to MMX register. 


0F7F/r 

MOVQ mm/m64, mm 

Move 64 bits from MMX register to MMX reqister/memorv. 

- 


Operation 

IF destination = mm 
THEN 

mm <- mm/m64: 
ELSE 

mm/m64 mm; 


Description 

The MOVQ instruction copies 64 bits from the source operand to the destination operand. 

The destination and source operands can be either MMX registers or 64-bit memory 
operands. The MOVQ instruction cannot transfer data from memory to memory. 

When the destination is an MMX register and the source is a 64-bit memory operand, the 
64 bits of data at the memory location are copied into the MMX register. 

When the destination is a 64-bit memory operand and the source is an MMX register, the 
64 bits of data are copied from the MMX register into the memory location. 

When the destination and source are both MMX registers, the contents of the MMX register 
(source) are copied into an MMX register (destination). 


Example 


MOVQ mm, m64 

-► 

63 

4847 

32 31 

1615 

0 

15 ^0 


b7 

b6 

b5 

b4 

b3 

b2 

b1 

bO 


b7 

b6 

Wn.3 


mm 

b5 

b4 

Wn.2 



b3 

b2 

Wn.i 



b1 

bO 

Wn.o^ 

— m64 
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Flags Affected 

None. 

Protected Mode Exceptions 

#GP(0) if the destination is in a nonwritable segment; #GP(0) for an illegal memory operand 
effective address in the CS, DS, ES, FS, or GS segments; #SS(0) for an illegal address in the 
SS segment; #PF(fault-code) for a page fault; #AC for unaligned memory reference if the 
current privilege level is 3; #UD if CRO.EM = 1; #NM if TS bit in CRO is set; #MF if there is 
a pending FPU exception. 

Real Address Mode Exceptions 

Interrupt 13 if any part of the operand lies outside of the effective address space from 0 to 
OFFFFFH; #UD if CRO.EM = 1; #NM if TS bit in CRO is set; #MF if there is a pending FPU 
exception. 

Virtual 8086 Mode Exceptions 

Same exceptions as in Real Address Mode; #PF(fault-code) for a page fault; #AC for 
unaligned memory reference. 
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PACKSSWB /PACKSSDW—Pack with Signed Saturation 


Opcode 

Instruction 

Description 

OF 63 /r 

PACKSSWB mm, mm/m64 

Pack and saturate signed words from MMX register and MMX 
register/memory into signed bytes in MMX register. 

OF 6B /r 

PACKSSDW mm, mm/m64 

Pack and saturate signed dwords from MMX register and MMX 
reqister/memorv into signed words in MMX register. 


Operation 

IF instruction is PACKSSWB 
THEN{ 

nnm(7..0) <- SaturateSignedWordToSignedByte mm(15..0); 
mm(15..8) <- SaturateSignedWordToSignedByte mm(31..16): 
mm(23..16) <- SaturateSignedWordToSignedByte mm(47..32): 
mm(31..24) <- SaturateSignedWordToSignedByte mm(63..48): 
mm(39..32) <- SaturateSignedWordToSignedByte mm/m64(15..0); 
mm(47..40) <- SaturateSignedWordToSignedByte mnn/m64(31..16): 
mm(55..48) <- SaturateSignedWordToSignedByte mm/m64(47..32); 
mm(63..56) <- SaturateSignedWordToSignedByte mm/m64(63..48); 

} 

ELSE {(* instruction is PACKSSDW *) 
mm(15..0) <- SaturateSignedDwordToSignedWord mm(31..0); 
mm(31 ..16) <- SaturateSignedDwordToSignedWord mm(63..32); 
mm(47..32) <- SaturateSignedDwordToSignedWord mm/m64(31..0); 
mm(63..48) <- SaturateSignedDwordToSignedWord mm/m64(63..32): 

} 

Description 

The PACKSS instruction packs and saturates the signed data elements from the source and 
the destination operands and writes the signed results to the destination operand. 

The destination operand is an MMX register. The source operand can either be an MMX 
register or a 64-bit memory operand. 

The PACKSSWB instruction packs four signed words from the source operand and four 
signed words from the destination operand into eight signed bytes in the destination register. 
If the signed value of a word is larger or smaller than the range of a signed byte, the value is 
saturated (in the case of an overflow - to 0x7F, and in the case of an underflow - to 0x80). 

The PACKSSDW instruction packs two signed doublewords from the source operand and 
two signed doublewords from the destination operand into four signed words in the 
destination register. If the signed value of a doubleword is larger or smaller than the range of 
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a signed word, the value is saturated (in the case of an overflow - to 0x7FFF, and in the case 
of an underflow - to 0x8000). 


Example 



Flags Affected 

None. 

Protected Mode Exceptions 

#GP(0) for an illegal memory operand effective address in the CS, DS, ES, FS or GS 
segments; #SS(0) for an illegal address in the SS segment; #PF(fault-code) for a page fault; 
#AC for unaligned memory reference if the current privilege level is 3; #UD if CRO.EM = 1; 
#NM if TS bit in CRO is set; #MF if there is a pending FPU exception. 

Real Address Mode Exceptions 

Interrupt 13 if any part of the operand lies outside of the effective address space from 0 to 
OFFFFH; #UD if CRO.EM = 1; #NM if TS bit in CRO is set; #MF if there is a pending FPU 
exception. 

Virtual 8086 Mode Exceptions 

Same exceptions as in Real Address Mode; #PF(fault-code) for a page fault; #AC for 
unaligned memory reference. 
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PACKUSWB—Pack with Unsigned Saturation 


Opcode 

Instruction 

Description 

OF 67 /r 

PACKUSWB mm, mm/m64 

Pack and saturate signed words from MMX register and MMX 
reqister/memorv into unsigned bytes in MMX register. 


Operation 

mm(7..0) <- SaturateSignedWordToUnsignedByte mm(15..0); 
mm(15..8)<- SaturateSignedWordToUnsignedByte mm(31..15); 
mm(23..16) <- SaturateSignedWordToUnsignedByte mm(47..32): 
mm(31..24) <- SaturateSignedWordToUnsignedByte mm(63..48); 
mm(39..32\ <- SaturateSignedWordToUnsignedByte mm/m64(15..0): 
mm(47..40) SaturateSignedWordToUnsignedByte mm/m64(31..16): 
mm(55..48) <- SaturateSignedWordToUnsignedByte nnm/m64(47..32); 
mm(63..56j <- SaturateSignedWordToUnsignedByte nnm/m64(63..48); 

Description: 

The PACKUSWB packs and saturates four signed words of the source operand and four 
signed words of the destination operand into eight unsigned bytes. The result is written to the 
destination operand 

The destination operand is an MMX register. The source operand can either be an MMX 
register or a 64-bit memory operand. 

If the signed value of the word is larger or smaller than the range of an unsigned byte, the 
value is saturated (in the case of an overflow - to OxFP and in the case of an underflow - to 
0x00). 
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Example 



Flags Affected 

None. 

Protected Mode Exceptions 

#GP(0) for an illegal memory operand effective address in the CS, DS, ES, FS or GS 
segments; #SS(0) for an illegal address in the SS segment; #PF(fault-code) for a page fault; 
#AC for unaligned memory reference if the current privilege level is 3; #UD if CRO.EM = 1; 
#NM if TS bit in CRO is set; #MF if there is a pending FPU exception. 

Real Address Mode Exceptions 

Interrupt 13 if any part of the operand lies outside of the effective address space from 0 to 
OFFFFH; #UD if CRO.EM = 1; #NM if TS bit in CRO is set; #MF if there is a pending FPU 
exception. 

Virtual 8086 Mode Exceptions 

Same exceptions as in Real Address Mode; #PF(fault-code) for a page fault; #AC for 
unaligned memory reference. 
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PADDB/PADDW/PADDD—Packed Add 


Opcode 

Instruction 

Description 

OF FC /r 

PADDB mm, mm/m64 

Add packed byte from MMX register/memory to packed byte in 

MMX register. 

OF FD /r 

PADDW mm, mm/m64 

Add packed word from MMX register/memory to packed word in 

MMX register. 

OF FE/r 

PADDD mm, mm/m64 

Add packed dword from MMX register/memory to packed dword in 
MMX register. 


Operation 

IF instruction is PADDB 
THEN { 

mm(7..0) <r- mm(7..0) + mm/m64(7..0): 
nnm(15..8) <- mm(15..8) + mm/m64(15..8); 
mm(23..16) <- mm(23..16)+ mm/m64(23..16): 
mm(31..24) <- mm(31..24) + mm/m64(31 ..24); 
mm(39..32) mm(39..32) + mm/m64(39..32); 

mm(47..40) <- mm(47..40)+ mm/m64(47..40); 
mm(55..48) <- mm(55..48) + mm/m64(55..48); 
mm(63..56) ^ mm(63..56) + mm/m64(63..56); 
} 

IF Instruction is PADDW 
THEN{ 

mm(15..0) <- mm(15..0) + mm/m64(15..0); 
mm(31..16) <- mm(31..16\ + mm/m64(31 ..16); 
mm(47..32) <- mm(47..32\ + mm/m64(47..32); 
mm(63..48) <- mm(63..48) + mm/m64(63..48); 
} 

ELSE {(* instruction is PADDD *) 
mm(31 ..0) <- mm(31 ..0) + mm/m64(31 ..0); 
mm(63..32) <- mm(63..32) + mm/m64(63..32); 
} 


Description 

The PADD instructions add the data elements of the source operand to the data elements of 
the destination register. The result is written to the destination register. If the result exceeds 
the data-range limit for the data type, it wraps around. 

The destination operand is an MMX register. The source operand can either be an MMX 
register or a 64-bit memory operand. 
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The PADDB instruction adds the bytes of the source operand to the bytes of the destination 
operand and writes the results to the MMX register. When the result is too large to be 
represented in a packed byte (overflow), the result wraps around and the lower 8 bits are 
written to the destination register. 

The PADDW instruction adds the words of the source operand to the words of the destination 
operand and writes the results to the MMX register. When the result is too large to be 
represented in a packed word (overflow), the result wraps around and the lower 16 bits are 
written to the destination register. 

The PADDD instruction adds the doublewords of the source operand to the doublewords of 
the destination operand and writes the results to the MMX register. When the result is too 
. large to be represented in a packed doubleword (overflow), the result wraps around and the 
lower 32 bits are written to the destination register. 



Example 


PADDW mm, mm/m64 


mm 


mm/m64 


mm 




1000000000000000 

0111111100111000 

+ 

+ 

+ 

+ 



1111111111111111 

0001011100000111 

1 i 1 1 



0111111111111111 

1001011000111111 


3006015 


Flags Affected 

None. 
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Protected Mode Exceptions 

#GP(0) for an illegal memory operand effective address in the CS, DS, ES, FS or GS 
segments; #SS(0) for an illegal address in the SS segment; #PF(fault-code) for a page fault; 
#AC for unaligned memory reference if the current privilege level is 3; #UD if CRO.EM = 1; 
#NM if TS bit in CRO is set; #MF if there is a pending FPU exception. 

Real Address Mode Exceptions 

Interrupt 13 if any part of the operand lies outside of the effective address space from 0 to 
OFFFFH; #UD if CRO.EM = 1; #NM if TS bit in CRO is set; #MF if there is a pending FPU 
exception. 

Virtual 8086 Mode Exceptions 

Same exceptions as in Real Address Mode; #PF(fault-code) for a page fault; #AC for 
unaligned memory reference. 
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PADDSB/PADDSW—Packed Add with Saturation 


Opcode 

Instruction 

Description 

OF EC /r 

PADDSB mm, mm/m64 

Add signed packed byte from MMX register/memory to signed 
packed byte in MMX register and saturate. 

OF ED/r 

PADDSW mm, mm/m64 

Add signed packed word from MMX register/memory to signed 
packed word in MMX register and saturate. 


Operation 

IF instruction is PADDSB 
THEN{ 

mm(7..0) <r- SaturateToSignedByte (mm(7..0) + mm/m64 (7..0)) ; 
mm(15..8) SaturateToSignedByte (mm(15..8) + mm/m64(15..8)); 
mm(23..16) SaturateToSignedByte (mm(23.,16)+ mm/m64(23..16)); 
mm(31 ..24) <r- SaturateToSignedByte (mm(31 ..24) + mm/m64(31 ..24)); 
mm(39..32) SaturateToSignedByte (mm(39..32) + mm/m64(39..32) ); 
mm(47..40) <- SaturateToSignedByte (mm(47..40)+ mm/m64(47..40)); 
mm(55..48) <- SaturateToSignedByte (mm(55..48) + mm/m64(55..48)); 
mm(63..56) <- SaturateToSignedByte (mm(63..56) + mm/nn64(63..56)); 

} 

ELSE {(* instruction is PADDSW *) 
mm(15..0) <- SaturateToSignedWord (mm(15..0) + mm/m64(15..0)); 
mm(31..16) SaturateToSignedWord (mm(31..16) + mm/m64(31..16)); 
mm(47..32) <- SaturateToSignedWord (mm(47..32) + mm/m64(47..32)); 
mm(63..48) <- SaturateToSignedWord {mm(63..48) + mm/m64(63..48)); 

} 

Description 

The PADDS instructions add the packed signed data elements of the source operand to the 
packed signed data elements of the destination operand and saturate the result. The result is 
written to the destination operand. 

The destination operand is an MMX register. The source operand can either be an MMX 
register or a 64-bit memory operand. 

The PADDSB instruction adds the signed bytes of the source operand to the signed bytes of 
the destination operand and writes the results to the MMX register. If the result is larger or 
smaller than the range of a signed byte, the value is saturated (in the case of an overflow - to 
0x7F, and in the case of an underflow - to 0x80). 

The PADDSW instruction adds the signed words of the source operand to the signed words 
of the destination operand and writes the results to the MMX register. If the result is larger or 
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smaller than the range of a signed word, the value is saturated (in the case of an overflow - to 
OxVFFF, and in the case of an underflow - to 0x8000). 


Example 


PADDSW mm, mm/m64 


mm 


mm/m64 


mm 




1000000000000000 

0111111100111000 I 

+ 

+ 

+ 

+ 



1111111111111111 

0001011100000111 

1 1 i 



1000000000000000 

0111111111111111 


3006016 


Flags Affected 

None. 

Protected Mode Exceptions 

#GP(0) for an illegal memory operand effective address in the CS, DS, ES, FS or GS 
segments; #SS(0) for an illegal address in the SS segment; #PF(fault-code) for a page fault; 
#AC for unaligned memory reference if the current privilege level is 3; #UD if CRO.EM = 1; 
#NM if TS bit in CRO is set; #MF if there is a pending FPU exception. 

Real Address Mode Exceptions 

Interrupt 13 if any part of the operand lies outside of the effective address space from 0 to 
OFFFFH; #UD if CRO.EM = 1; #NM if TS bit in CRO is set; #MF if there is a pending FPU 
exception. 

Virtual 8086 Mode Exceptions 

Same exceptions as in Real Address Mode; #PF(fault-code) for a page fault; #AC for 
unaligned memory reference. 
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PADDUSB/PADDUSW—Packed Add Unsigned with 
Saturation 


Opcode 

Instruction 

Description 

OF DC /r 

PADDUSB mm, mm/m64 

Add unsigned packed byte from MMX register/memory to unsigned 
packed byte in MMX register and saturate. 

OF DD /r 

PADDUSW mm, mm/m64 

Add unsigned packed word from MMX register/memory to unsigned 
packed word in MMX register and saturate. 


Operation 

IF instruction is PADDUSB 

THEN{ 

mm(7..0) <- SaturateToUnsignedByte (mm(7..0) + mm/m64 (7..0)); 
mm(15..8) <- SaturateToUnsignedByte (mm(15..8) + mm/m64(15..8)); 
mm(23..16) <- SaturateToUnsignedByte (mm(23..16)+ mm/m64(23..16)); 
mm(31 ..24) <- SaturateToUnsignedByte (mm(31 ..24) + mm/m64(31 ..24)); 
mm(39..32) SaturateToUnsignedByte (mm(39..32) + mm/m64(39..32)); 
mm(47..40) ^ SaturateToUnsignedByte (mm(47..40)+ nnm/m64(47..40)); 
mm(55..48) <- SaturateToUnsignedByte (mm(55..48) + mm/m64(55..48)); 
mm(63..56) ^ SaturateToUnsignedByte (mm(63..56) + mnn/nn64(63..56)); 
} 

ELSE {(* instruction is PADDUSW *) 
mm(15..0) <- SaturateToUnsignedWord (mm(15..0) + mm/m64(15..0)); 
mm(31..16) <-SaturateToUnsignedWord (mm(31..16j + mm/m64(31..16)) 
mm(47..32) <- SaturateToUnsignedWord (mm{47..32\ + mm/m64(47..32)) 
mm(63..48) <- SaturateToUnsignedWord (mm(63..48) + mm/m64(63..48)) 
} 


Description 

The PADDUS instructions add the packed unsigned data elements of the source operand to 
the packed unsigned data elements of the destination operand and saturate the results. The 
results are written to the destination operand. 

The destination operand is an MMX register. The source operand can either be an MMX 
register or a 64-bit memory operand. 

The PADDUSB instruction adds the unsigned bytes of the source operand to the unsigned 
bytes of the destination operand and writes the results to the MMX register. When the result 
is larger than the range of an unsigned byte (overflow), the value is saturated to OxFF. When 
the result is smaller than the range of an unsigned byte (underflow), the value is saturated to 
0x00. 
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The PADDUSW instruction adds the unsigned words of the source operand to the unsigned 
words of the destination operand and writes the results to the MMX register. When the result 
is larger than the range of an unsigned word (overflow), the value is saturated to OxFFFF. 
When the result is smaller than the range of an unsigned word (underflow), the value is 
saturated to 0x0000. 



Example 


PADDUSB mm, mm/m64 


mm 


mm/m64 


mm 







10000000 

01111111 

00111000 

+ 

+ 

+ 

+ 

+ 

+ 

+ 

+ 






11111111 

00010111 

00000111 







11111111 

10010110 

00111111 


3006017 


Flags Affected 

None. 

Protected Mode Exceptions 

#GP(0) for an illegal memory operand effective address in the CS, DS, ES, FS or GS 
segments; #SS(0) for an illegal address in the SS segment; #PF(fault-code) for a page fault; 
#AC for unaligned memory reference if the current privilege level is 3; #UD if CRO.EM = 1; 
#NM if TS bit in CRO is set; #MF if there is a pending FPU exception. 

Real Address Mode Exceptions 

Interrupt 13 if any part of the operand lies outside of the effective address space from 0 to 
OFFFFH; #UD if CRO.EM = 1; #NM if TS bit in CRO is set; #MF if there is a pending FPU 
exception. 
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Virtual 8086 Mode Exceptions 

Same exceptions as in Real Address Mode; #PF(fault-code) for a page fault; #AC for 
unaligned memory reference. 



INTEL ARCHITECTURE MMX™ INSTRUCTION SET 



PAND—Bitwise Logicai And 


Opcode 

Instruction 

Description 

OF DB /r 

PAND mm, mm/mG4 

AND 64 bits from MMX reqister/memorv to MMX reqister. 


Operation 

mm <-mm AND mm/m64; 

Description 

The PAND instruction performs a bitwise logical AND on 64 bits of the source and 
destination operands, and writes the result to the destination operand. 

Each bit of the result of the PAND instruction is set to 1 if the corresponding bits of the 
operands are 1. Otherwise, it is set to 0. 

The destination operand is an MMX register. The source operand can either be an MMX 
register or a 64-bit memory operand. 


Example 



PAND mm, mm/m64 

mm 

1111111111111000000000000000010110110101100010000111011101110111 



& 

mm/m64 

0001000011011001010100000011000100011110111011110001010110010101 






mm 

0001000011011000000000000000000100010100100010000001010100010101 



3006019 


Flags Affected 

None. 

Protected Mode Exceptions 

#GP(0) for an illegal memory operand effective address in the CS, DS, ES, FS or GS 
segments; #SS(0) for an illegal address in the SS segment; #PF(fault-code) for a page fault; 

5-26 I 
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#AC for unaligned memory reference if the current privilege level is 3; #UD if CRO.EM = 1; 
#NM if TS bit in CRO is set; #MF if there is a pending FPU exception. 

Real Address Mode Exceptions 

Interrupt 13 if any part of the operand lies outside of the effective address space from 0 to 
OFFFFH; #UD if CRO.EM = 1; #NM if TS bit in CRO is set; #MF if there is a pending FPU 
exception. 



Virtual 8086 Mode Exceptions 

Same exceptions as in Real Address Mode; #PF(fault-code) for a page fault; #AC for 
unaligned memory reference. 
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PANDN—Bitwise Logicai And Not 


Opcode 

Instruction 

Description 

OF DF/r 

PANDN mm, mm/m64 

Invert the 64 bits in MMX register, AND inverted MMX register 
with MMX register/memorv. 


Operation 

mm <-(NOT mm) AND mm/m64: 


Description 

The PANDN instruction performs a bitwise logical NOT on the 64 bits of the destination 
operand. The NOT inverts each of the 64 bits of the destination register so that every 1 
becomes a 0, and visa versa. 

The instruction then performs a bitwise logical AND on the inverted 64 bits of the 
destination operand and on the source operand. Each bit of the result of the AND instruction 
is set to 1 if the corresponding bits are 1. Otherwise, it is set to 0. The result is written to the 
destination register. 

The destination operand is an MMX register. The source operand can either be an MMX 
register or a 64-bit memory operand. 


Example 


PANDN mm, mm/m64 


mm 

mm/m64 

mm 


[1111111111110000000000000000101101101010011101111000100010001000 

_ & _ 

0100100101101010110001111000110110101111100010111111011011111100 


0000000000001010110001111000010010000101100010000111011001110100 


Flags Affected 

None. 
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Protected Mode Exceptions 

#GP(0) for an illegal memory operand effective address in the CS, DS, ES, FS or GS 
segments; #SS(0) for an illegal address in the SS segment; #PF(fault-code) for a page fault; 
#AC for unaligned memory reference if the current privilege level is 3; #UD if CRO.EM = 1; 
#NM if TS bit in CRO is set; #MF if there is a pending FPU exception. 

Real Address Mode Exceptions 

Interrupt 13 if any part of the operand lies outside of the effective address space from 0 to 
OFFFFH; #UD if CRO.EM = 1; #NM if TS bit in CRO is set; #MF if there is a pending FPU 
exception. 

Virtual 8086 Mode Exceptions 

Same exceptions as in Real Address Mode; #PF(fault-code) for a page fault; #AC for 
unaligned memory reference. 
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PCMPEQB/PCMPEQW/PCMPEQD—Packed Compare for 
Equal 


Opcode 

Instruction 

Description 

OF 74/r 

PCMPEQB mm, mm/m64 

Compare packed byte in MMX register/memory with packed byte in 
MMX register for equality. 

07, 75, /r 

PCMPEQW mm, mm/m64 

Compare packed word in MMX register/memory with packed word 
in MMX register for equality. 

07, 76, /r 

PCMPEQD mm, mm/m64 

Compare packed dword in MMX register/memory with packed 
dword in MMX register for equality. 


Operation 

IF instruction is PCMPEQB 
THEN{ 

IF mnn(7..0) = mm/m64(7..0) 
THEN mm(7 0) 4 - OxFF; 

ELSE mm(7..0) <- 0; 

IF nnm(15..8) = mm/m64(15., 8) 
THEN mm(15,.8) <- OxFF; 

ELSE mm(15..8)<-0: 

IF mm(63..56) = mm/m64(63..56) 
THEN mm(63..56) <- OxFF; 

ELSE mm(63..56) <- 0; 

} 

ELSE IF instruction is PCMPEQW 
THEN{ 

IF mm(15..0) = mm/m64(15..0) 
THEN mm(15..0) <- OxFFFF; 
ELSE mm(15..0) <-0; 

IF mm(31 ..16) = mm/m64(31 ..16) 
THEN mm(31 ..16) <- OxFFFF; 
ELSE mm(31..16) <-0; 

IF mm(63..48) = mm/m64(63..48) 
THEN mm(63..48) <- OxFFFF; 
ELSE mm(63..48) <- 0; 

} 
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ELSE { (* instruction is PCMPEQD *) 

IF mm(31..0) = mm/m64(31..0) 

THEN mm(31..0) OxFFFFFFFF; 

ELSE mm(31..0) 0; 

IF mm(63..32) = mm/m64(63..32) 

THEN mm(63..32) f-OxFFFFFFFF; 

ELSE mm(63..32) 0; 

} 

Description 

The PCMPEQ instructions compare the data elements in the destination operand to the 
corresponding data elements in the source operand. If the data elements are equal, the 
corresponding data element in the destination register is set to all ones. If they are not equal, 
the corresponding data element is set to all zeros. 

The destination operand is an MMX register. The source operand can either be an MMX 
register or a 64-bit memory operand. 

The PCMPEQB instruction compares the bytes in the destination operand to the bytes in the 
source operand. The bytes in the destination operand are set accordingly. 

The PCMPEQW instruction compares the words in the destination operand to the words in 
the source operand. The words in the destination operand are set accordingly. 

The PCMPEQD instruction compares the doublewords in the destination operand to the 
doublewords in the source operand. The doublewords in the destination operand are set 
accordingly. 
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Example 



PCMPEQW mm, mm/m64 



mm 

0000000000000000 

0000000000000001 

0000000000000111 

0111000111000111 



== 

== 


== 

mm/m64 

0000000000000000 

0000000000000000 

0111000111000111 

0111000111000111 



True 

False 

False 

4 - 

True 

mm 

1111111111111111 

0000000000000000 

0000000000000000 

1111111111111111 






3006020 


Flags Affected 

None: 

Protected Mode Exceptions 

#GP(0) for an illegal memory operand effective address in the CS, DS, ES, FS or GS 
segments; #SS(0) for an illegal address in the SS segment; #PF(fault-code) for a page fault; 
#AC for unaligned memory reference if the current privilege level is 3; #UD if CRO.EM = 1; 
#NM if TS bit in CRO is set; #MF if there is a pending FPU exception. 

Real Address Mode Exceptions 

Interrupt 13 if any part of the operand lies outside of the effective address space from 0 to 
OFFFFH; #UD if CRO.EM = 1; #NM if TS bit in CRO is set; #MF if there is a pending FPU 
exception. 

Virtual 8086 Mode Exceptions 

Same exceptions as in Real Address Mode; #PF(fault-code) for a page fault; #AC for 
unaligned memory reference. 
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PCMPGTB/PCMPGTW/PCMPGTD—Packed Compare for 
Greater Than 


Opcode 

Instruction 

Description 

OF 64 /r 

PCMPGTB mm, mm/m64 

Compare packed byte in MMX register with packed byte in MMX 
register/memory for greater value. 

OF 65/r 

PCMPGTW mm, mm/m64 

Compare packed word in MMX register with packed word in MMX 
register/memory for greater value. 

OF 66 /r 

PCMPGTD mm, mm/m64 

Compare packed dword in MMX register with packed dword in MMX 
reqister/memorv for greater value. 


Operation 

IF instruction is PCMPGTB 
THEN{ 

IF mm(7..0) > mm/m64(7.,0) 
THEN mm(7 0) <r- OxFF; 

ELSE mm(7..0) <- 0; 

IF mm(15..8) > mm/m64(15.. 8) 
THEN mm(15..8) 4- OxFF; 

ELSE mm(15..8) «-0: 

IF mm(63..56) > mm/m64(63..56) 
THEN mm(63..56) <— OxFF; 

ELSE mm(63..56) <- 0; 

} 

ELSE IF instruction is PCMPGTW 
THEN { 

IF mm(15..0) > mm/m64(15..0) 
THEN mm(15..0) <- OxFFFF; 
ELSE mm(15..0) <-0; 

IF nnm(31..16) > mm/m64(31..16) 
THEN mm(31 ..16) OxFFFF; 
ELSE mm(31..16) <- 0; 

IF mm(63..48) > mm/m64(63..48) 
THEN mm(63..48) <- OxFFFF; 
ELSE mm(63..48) <- 0; 

} 

ELSE {(* instruction is PCMPGTD *) 
IF mm(31..0) > mm/m64(31..0) 
THEN mm(31 ..0) <- OxFFFFFFFF; 
ELSE mm(31..0) 0; 

IF mm(63..32) > mnn/m64(63..32) 
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THEN mm(63..32) <- OxFFFFFFFF; 

ELSE mm(63..32) <r- 0; 

} 



Description 

The PCMPGT instructions compare the signed data elements in the destination operand to 
the signed data elements in the source operand. If the signed data elements in the destination 
register are greater than those in the source operand, the corresponding data element in the 
destination operand is set to all ones. Otherwise, it is set to all zeros. 

The destination operand is an MMX register. The source operand can either be an MMX 
register or a 64-bit memory operand. 

The PCMPGTB instruction compares the signed bytes in the destination operand to the 
corresponding signed bytes in the source operand. The bytes in the destination register are set 
accordingly. 

The PCMPGTW instruction compares the signed words in the destination operand to the 
corresponding signed words in the source operand. The words in the destination register are 
set accordingly. 

The PCMPGTD instruction compares the signed doublewords in the destination operand to 
the corresponding signed words in the source operand. The doublewords in the destination 
register are set accordingly. 
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Example 


PCMPGTW mm, mm/m64 


0000000000000000 

0000000000000001 

0000000000000111 

0111000111000111 

> 

> 

> 

> 

0000000000000000 

0000000000000000 

0111000111000111 

0111000111000111 

False 

True 

False 

False 





0000000000000000 

1111111111111111 

0000000000000000 

0000000000000000 


Flags Affected 

None. 

Protected Mode Exceptions 

#GP(0) for an illegal memory operand effective address in the CS, DS, ES, FS or GS 
segments; #SS(0) for an illegal address in the SS segment; #PF(fault-code) for a page fault; 
#AC for unaligned memory reference if the current privilege level is 3; #UD if CRO.EM = 1; 
#NM if TS bit in CRO is set; #MF if there is a pending FPU exception. 

Real Address Mode Exceptions 

Interrupt 13 if any part of the operand lies outside of the effective address space from 0 to 
OFFFFH; #UD if CRO.EM = 1; #NM if TS bit in CRO is set; #MF if there is a pending FPU 
exception. 

Virtual 8086 Mode Exceptions 

Same exceptions as in Real Address Mode; #PF(fault-code) for a page fault; #AC for 
unaligned memory reference. 
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PMADDWD—Packed Multiply and Add 


Opcode 

Instruction 

Description 

OF F5 /r 

PMADDWD mm, mm/m64 

Multiply the packed word in MMX register by the packed word in 

MMX reg/memory. Add the 32-bit results pairwise and store in MMX 
reqister as dword 


Operation 

mm(31..0) <r- mm(15..0) * mm/m64(15..0) + mm(31..16) * mm/m64(31..16): 
mm(63..32) <r- mm(47..32) * mm/m64(47..32) + mm(63..48) * mm/m64(63..48): 

Description 

The PMADDWD instruction multiplies the four signed words of the destination operand by 
the four signed words of the source operand. The result is two 32-bit doublewords. The two 
high-order words are summed and stored in the upper doubleword of the destination operand. 
The two low-order words are summed and stored in the lower doubleword of the destination 
operand. This result is written to the destination operand. 

The destination operand is an MMX register. The source operand can either be an MMX 
register or a 64-bit memory operand. 

The PMADDWD instruction wraps around to 0x80000000 only when all four words of both 
the source and destination operands are 0x8000. 
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Example 



PMADDWD mm, mm/m64 



mm 



0111000111000111 0111000111000111 


* 

* 

* 

* 

mm/m64 



1000000000000000 

0000010000000000 




F 

' H 

F 

mm 


1100100011100011 

1001110000000000 







3006023 


Flags Affected 

None. 

Protected Mode Exceptions 

#GP(0) for an illegal memory operand effective address in the CS, DS, ES, FS or GS 
segments; #SS(0) for an illegal address in the SS segment; #PF(fault-code) for a page fault; 
#AC for unaligned memory reference if the current privilege level is 3; #UD if CRO.EM = 1; 
#NM if TS bit in CRO is set; #MF if there is a pending FPU exception. 

Real Address Mode Exceptions 

Interrupt 13 if any part of the operand lies outside of the effective address space from 0 to 
OFFFFH; #UD if CRO.EM = 1; #NM if TS bit in CRO is set; #MF if there is a pending FPU 
exception. 

Virtual 8086 Mode Exceptions 

Same exceptions as in Real Address Mode; #PF(fault-code) for a page fault; #AC for 
unaligned memory reference. 
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PMULHW—Packed Multiply High 


Opcode 

Instruction 

Description 

OF E5 /r 

PMULHW mm, mm/m64 

Multiply the signed packed word in MMX register with the signed 
packed word in MMX reg/memory, then store the high-order 16 bits of 
the results in MMX reaister. 



Operation 

mm(15..0) <- (mm(15..0) * mm/m64(15..0)) (31..16); 
mm(31 ..16)<- (mm(31 ..16) * mm/m64(31 ..16)) (31 ..16); 
mm(47..32) <- (mm(47..32) * mm/m64(47..32)) (31 ..16); 
mm(63..48) <- (mm(63..48) * mm/m64(63..48)) (31..16); 


Description 

The PMULHW instruction multiplies the four signed words of the destination operand with 
the four signed words of the source operand. The high-order 16 bits of the 32-bit intermediate 
results are written to the destination operand. 

The destination operand is an MMX register. The source operand can either be an MMX 
register or a 64-bit memory operand. 


Example 


PMULHW mm, mm/m64 


mm 


mm/m64 


mm 



0111000111000111 

0111000111000111 

* 

* 

* 

* 

1000000000000000 

0000010000000000 

High^rder 

High^rder 

High^rder 

High^rder 



1100011100011100 

0000000111000111 


3006022 


Flags Affected 

None. 
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Protected Mode Exceptions 

#GP(0) for an illegal memory operand effective address in the CS, DS, ES, FS or GS 
segments; #SS(0) for an illegal address in the SS segment; #PF(fault-code) for a page fault; 
#AC for unaligned memory reference if the current privilege level is 3; #UD if CRO.EM = 1; 
#NM if TS bit in CRO is set; #MF if there is a pending FPU exception. 

Real Address Mode Exceptions 

Interrupt 13 if any part of the operand lies outside of the effective address space from 0 to 
OFFFFH; #UD if CRO.EM = 1; #NM if TS bit in CRO is set; #MF if there is a pending FPU 
exception. 

Virtual 8086 Mode Exceptions 

Same exceptions as in Real Address Mode; #PF(fault-code) for a page fault; #AC for 
unaligned memory reference. 
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PMULLW—Packed Multiply Low 


Opcode 

Instruction 

Description 

OF D5 /r 

PMULLW mm, mm/m64 

Multiply the packed word in MMX register with the packed word in 
MMX reg/memory, then store the low-order 16 bits of the results in 
MMX reqister. 


Operation 

nnm(15..0) <- (mm(15..0) * mm/m64(15..0)) (15..0); 
mnn(31..16) <- (mm(31 ..16) * mm/m64(31..16)) (15..0) 
mm(47..32) <- (mm(47..32) * mm/m64(47..32)) {15..0): 
mm(63..48) <- (mm(63..48) * mm/m64(63..48)) (15..0) 


Description 

The PMULLW instruction multiplies the four signed or unsigned words of the destination 
operand with the four signed or unsigned words of the source operand. The low-order 16 bits 
of the 32-bit intermediate results are written to the destination operand. 

The destination operand is an MMX register. The source operand can either be an MMX 
register or a 64-bit memory operand. 

Example 




Flags Affected 

None. 

I 


5-40 
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Protected Mode Exceptions 

#GP(0) for an illegal memory operand effective address in the CS, DS, ES, FS or GS 
segments; #SS(0) for an illegal address in the SS segment; #PF(fault-code) for a page fault; 
#AC for unaligned memory reference if the current privilege level is 3; #UD if CRO.EM = 1; 
#NM if TS bit in CRO is set; #MF if there is a pending FPU exception. 

Real Address Mode Exceptions 

Interrupt 13 if any part of the operand lies outside of the effective address space from 0 to 
OFFFFH; #UD if CRO.EM = 1; #NM if TS bit in CRO is set; #MF if there is a pending FPU 
exception. 

Virtual 8086 Mode Exceptions 

Same exceptions as in Real Address Mode; #PF(fault-code) for a page fault; #AC for 
unaligned memory reference. 
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POR—Bitwise Logicai Or 


Opcode 

Instruction 

Description 

OF EB/r 

POR mm, mm/m64 

OR 64 bits from MMX req/memorv with MMX register. 


Operation 

mm <-mm OR mm/m64; 

Description 

The POR instruction performs a bitwise logical OR on 64 bits of the destination and source 
operands, and writes the result to the destination register. 

Each bit of the result is set to 0 if the corresponding bits of the two operands are 0, 
Otherwise, the bit is 1. 

The destination operand is an MMX register. The source operand can either be an MMX 
register or a 64-bit memory operand. 


Example 



POR mm, mm/m64 


mm 

1111111111111000000000000000010110110101100010000111011101110111 





mm/m64 

0001000011011001010100000011000100011110111011110001010110010101 



> 

r 

mm 

1111111111111001010100000011010110111111111011110111011111110111 





3006024 


Flags Affected 

None. 
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Protected Mode Exceptions 

#GP(0) for an illegal memory operand effective address in the CS, DS, ES, FS or GS 
segments; #SS(0) for an illegal address in the SS segment; #PF(fault-code) for a page fault; 
#AC for unaligned memory reference if the current privilege level is 3; #UD if CRO.EM = 1; 
#NM if TS bit in CRO is set; #MF if there is a pending FPU exception. 

Real Address Mode Exceptions 

Interrupt 13 if any part of the operand lies outside of the effective address space from 0 to 
OFFFFH; #UD if CRO.EM = 1; #NM if TS bit in CRO is set; #MF if there is a pending FPU 
exception. 

Virtual 8086 Mode Exceptions 

Same exceptions as in Real Address Mode; #PF(fault-code) for a page fault; #AC for 
unaligned memory reference. 
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PSLLW/PSLLD/PSLLQ—Packed Shift Left Logical 


Opcode 

Instruction 

Description 

OF F1 /r 

PSLLW mm, mm/m64 

Shift words in MMX register left by amount specified in MMX 
reg/memory, while shifting in zeros. 

OF 71 /6. ib 

PSLLW mm, immS 

Shift words in MMX register left by Imm8, while shifting in zeros. 

OF F2 /r 

PSLLD mm, mm/m64 

Shift dwords in MMX register left by amount specified in MMX 
reg/memory, while shifting in zeros. 

0F72/6ib 

PSLLD mm, immd 

Shift dwords in MMX register by ImmS, while shifting in zeros.. 

OF F3 /r 

PSLLQ mm, mm/m64 

Shift MMX register left by amount specified in MMX reg/memory, 
while shifting in zeros. 

OF 73 /6 ib 

PSLLQ mm, imm8 

Shift MMX register left by ImmS, while shifting in zeros. 


Operation 

IF the second operand is imm8 
THEN 

temp <- imm8; 

ELSE (* second operand is mm/m64 *) 
temp <- mm/m64; 

IF instruction is PSLLW 
THEN{ 

mm(15..0) <- mm(15..0) « temp; 
mm(31..16) <- mm(31..16) «temp; 
mm(47..32) <- mm(47..32) «temp; 
mm(63..48) <- mm(63..48) « temp; 
} 

ELSE IF instruction is PSLLD 
THEN{ 

mm(31 ..0) <- mm(31 ..0) «temp; 
mm(63..32) <- mm(63..32) « temp; 
} 

ELSE (* instruction is PSLLQ *) 
mm <- mm « temp; 


Description 

The PSLL instructions shift the bits of the first operand to the left by the amount of bits 
specified in the count operand. The result of the shift operation is written to the destination 
register. The empty low-order bits are cleared (set to zero). If the value specified by the 
second operand is greater than 15 (for words), 31 (for doublewords), or 63 (for quadwords), 
then the destination is set to all zeros. 

The destination operand is an MMX register. The count operand (source operand) can be 
either an MMX register, a 64-bit memory operand, or an immediate 8-bit operand. 
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The PSLLW instruction shifts each of the four words of the destination register to the left by 
the number of bits specified in the count operand. The low-order bit positions (of each word) 
are filled with zeros. 

The PSLLD instruction shifts each of the two doublewords of the destination register to the 
left by the number of bits specified in the count operand. The low-order bit positions (of each 
doubleword) are filled with zeros. 

The PSLLQ instruction shifts the 64-bit quadword in the destination register to the left by the 
number of bits specified in the count operand. The low-order bit positions are filled with 
zeros. 



Example 



Flags Affected 

None. 


Protected Mode Exceptions 

#GP(0) for an illegal memory operand effective address in the CS, DS, ES, FS or GS 
segments; #SS(0) for an illegal address in the SS segment; #PF(fault-code) for a page fault; 
#AC for unaligned memory reference if the current privilege level is 3; #UD if CRO.EM = 1; 
#NM if TS bit in CRO is set; #MF if there is a pending FPU exception. 



5-45 
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Real Address Mode Exceptions 

Interrupt 13 if any part of the operand lies outside of the effective address space from 0 to 
OFFFFH; #UD if CRO.EM = 1; #NM if TS bit in CRO is set; #MF if there is a pending FPU 
exception. 

Virtual 8086 Mode Exceptions 

Same exceptions as in Real Address Mode; #PF(fault-code) for a page fault; #AC for 
unaligned memory reference. 
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PSRAW/PSRAD—Packed Shift Right Arithmetic 


Opcode 

Instruction 

Description 

OF E1 /r 

PSRAW mm, mm/m64 

Shift words in MMX register right by amount specified in MMX 
reg/memory while shifting in sign bits. 

OF 71 M ib 

PSRAW mm, immS 

Shift words in MMX register right by ImmS while shifting in sign bits 

OF E2 /r 

PSRAD mm, mm/m64 

Shift dwords in MMX register right by amount specified in MMX 
reg/memory while shifting in sign bits. 

OF 72 /4 ib 

PSRAD mm, immS 

Shift dwords in MMX register riqht by ImmS while shiftinq in sign bits. 


Operation 

IF the second operand is imm8 
THEN 

temp <r- immS; 

ELSE (* second operand is mm/m64 *) 
temp <- mm/m64: 

IF instruction is PSRAW 
THEN{ 

mm(15..0) <- SignExtend (mm(15..0) »temp); 
mm(31..16) 4- SignExtend (mm(31..16) »temp); 
mm(47..32) 4 - SignExtend (mm(47..32) »temp); 
mm(63..48) 4 - SignExtend (mm(63..48) »temp); 
} 

ELSE {(‘instruction is PSRAD *) 
mm(31..0) <- SignExtend (mm(31..0) »temp); 
mm(63..32) 4- SignExtend (mm(63..32) » temp); 
} 


Description 

The PSRA instructions shift the bits of the first operand to the right by the amount of bits 
specified in the count operand. The result of the shift operation is written to the destination 
register. The empty high-order bits of each element are filled with the initial value of the sign 
bit of the data element. If the value specified by the second operand is greater than 15 (for 
words), or 31 (for doublewords), each destination element is filled with the initial value of 
the sign bit of the element. 

The destination operand is an MMX register. The count operand (source operand) can be 
either an MMX register, a 64-bit memory operand, or an immediate 8-bit operand. 
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The PSRAW instruction shifts each of the four words in the destination register to the right 
by the number of bits specified in the count operand. The initial value of the sign bit of the 
data elements in the destination operand is copied into the most significant bits of the data 
element. 

The PSRAD instruction shifts each of the two doublewords in the destination register to the 
right by the number of bits specified in the count operand. The initial value of the sign bit of 
the data elements in the destination operand is copied into the most significant bits of the 
data element. 

Example 


PSRAW mm, 2 


mm 




1111111111111100 1101000111000111 


shift right 

shift right 

shift right 

shift right 



f 


r 


r 


r 

mm 1 




1111111111111111 1111010001110001 


Flags Affected 

None. 


Protected Mode Exceptions 

#GP(0) for an illegal memory operand effective address in the CS, DS, ES, FS or GS 
segments; #SS(0) for an illegal address in the SS segment; #PF(fault-code) for a page fault; 
#AC for unaligned memory reference if the current privilege level is 3; #UD if CRO.EM = 1; 
#NM if TS bit in CRO is set; #MF if there is a pending FPU exception. 












INTEL ARCHITECTURE MMX^m INSTRUCTION SET 

Real Address Mode Exceptions 

Interrupt 13 if any part of the operand lies outside of the effective address space from 0 to 
OFPPFH; #UD if CRO.EM = 1; #NM if TS bit in CRO is set; #MF if there is a pending FPU 
exception. 



Virtual 8086 Mode Exceptions 

Same exceptions as in Real Address Mode; #PF(fault-code) for a page fault; #AC for 
unaligned memory reference. 
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PSRLW/PSRLD/PSRLQ—Packed Shift Right Logical 


Opcode 

Instruction 

Description 

OF D1 /r 

PSRLW mm, mm/m64 

Shift words in MMX register right by amount specified in MMX 
reg/memory while shifting in zeros. 

OF 71 /2 ib 

PSRLW mm, imm8 

Shift words in MMX register right by ImmS. 

OF D2 /r 

PSRLD mm, mm/m64 

Shift dwords in MMX register right by amount specified in MMX 
reg/memory while shifting in zeros. 

OF 72/2 ib 

PSRLD mm, immS 

Shift dwords in MMX register right by ImmS . 

OF D3/r 

PSRLQ mm, mm/m64 

Shift MMX register right by amount specified in MMX reg/memory while 
shifting in zeros. 

OF 73/2 ib 

PSRLQ mm, imm8 

Shift MMX register right by ImmS while shifting in zeros. 


Operation 

IF the second operand is imm8 
THEN 

temp <- imm8; 

ELSE (* second operand is mm/m64 *) 
temp <- mm/m64; 

IF instruction is PSRLW 
THEN{ 

mm(15..0) mm(15..0) » temp; 

mm(31 ..16) mm(31 ..16) »temp; 

mm(47..32) <- mm(47..32) »temp; 
mm(63..48) <- mm(63..48) »temp; 
} 

ELSE IF instruction is PSRLD 
THEN{ 

mm(31 ..0) <- mm(31 ..0) »temp; 
mm(63..32) <- mm(63..32) »temp; 
} 

ELSE (* instruction is PSRLQ *) 
mm <r- mm » temp; 


Description 

The PSRL instructions shift the bits of the first operand to the right by the amount of bits 
specified in the count operand. The result of the shift operation is written to the destination 
register. The empty high-order bits are cleared (set to zero). If the value specified by the 
second operand is greater than 15 (for words), or 31 (for doublewords), or 63 (for 
quadwords), then the destination is set to all zeros. 

The destination operand is an MMX register. The count operand (source operand) can be 
either an MMX register, a 64-bit memory operand, or an immediate 8-bit operand. 
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The PSRLW instruction shifts each of the four words in the destination register to the right 
by the number of bits specified in the count operand. The empty high-order bits (of each 
word) are filled with zeros. 

The PSLLD instruction shifts each of the two doublewords in the destination register to the 
right by the number of bits specified in the count operand. The empty high-order bits (of 
each doubleword) are filled with zeros. 

The PSLLQ instruction shifts the 64-bit quadword in the destination register to the right by 
the number of bits specified in the count operand. The empty high-order bits are filled with 
zeros. 


Example 



Flags Affected 

None. 


Protected Mode Exceptions 

#GP(0) for an illegal memory operand effective address in the CS, DS, ES, FS or GS 
segments; #SS(0) for an illegal address in the SS segment; #PF(fault-code) for a page fault; 
#AC for unaligned memory reference if the current privilege level is 3; #UD if CRO.EM = 1; 
#NM if TS bit in CRO is set; #MF if there is a pending FPU exception. 
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Real Address Mode Exceptions 

Interrupt 13 if any part of the operand lies outside of the effective address space from 0 to 
OFFFFH; #UD if CRO.EM = 1; #NM if TS bit in CRO is set; #MF if there is a pending FPU 
exception. 

Virtual 8086 Mode Exceptions 

Same exceptions as in Real Address Mode; #PF(fault-code) for a page fault; #AC for 
unaligned memory reference. 
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PSUBB/PSUBW/PSUBD—Packed Subtract 


Opcode 

Instruction 

Description 

OF F8/r 

PSUBB mm, mm/m64 

Subtract packed byte in MMX reg/memory from packed byte in 

MMX register. 

OF F9/r 

PSUBW mm, mm/m64 

Subtract packed word in MMX reg/memory from packed word in 

MMX register. 

OFFA/r 

PSUBD mm, mm/m64 

Subtract packed dword in MMX reg/memory from packed dword in 
MMX register. 


Operation 

IF instruction is PSUBB 
THEN { 

mm(7..0) <- mm(7..0) - mm/m64(7..0): 
mm(15..8) <- mm(15..8) - mm/m64(15..8); 
mm(23..16) <- mm(23..16) - mm/m64(23..16); 
mm(31..24) mm(31..24) - mm/m64(31..24); 

mm(39..32) <- mm(39..32) - mm/m64(39..32); 
mm(47..40) <- mm(47..40) - mm/m64(47..40); 
mm(55..48) <- mm(55..48) - mm/m64(55..48); 
mm(63..56) mm(63..56) - mm/m64(63..56); 

} 

IF instruction is PSUBW 
THEN{ 

mm(15..0) <- mm(15..0) - mm/m64(15..0); 
mm(31..16) mm(31..16\ - mm/m64(31..16); 

mm(47..32) <- mm(47..32j - mm/m64(47..32); 
mm(63..48) mm(63..48) - mm/m64(63..48); 

} 

ELSE {(* instruction is PSUBD *) 

mm(31..0) <- mm(31..0) - mm/m64(31..0); 
mm(63..32) <- mm(63..32) - mm/m64(63..32); 
} 


Description 

The PSUB instructions subtract the data elements of the source operand from the data 
elements of the destination operand. The result is written to the destination register. If the 
result is larger or smaller than the data-range limit for the data type, it wraps around. 

The destination operand is an MMX register. The source operand can either be an MMX 
register or a 64-bit memory operand. 
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The PSUBB instruction subtracts the bytes of the source operand from the bytes of the 
destination operand. The result is written to the MMX register. When the result is too large 
or too small to be represented in a byte, the result wraps around and the lower 8 bits are 
written to the destination register. 

The PSUBW instruction subtracts the words of the source operand from the words of the 
destination operand. The result is written to the MMX register. When the result is too large 
or too small to be represented in a word, the result wraps around and the lower 16 bits are 
written to the destination register. 

The PSUBD instruction subtracts the doublewords of the source operand from the 
doublewords of the destination operand. The result is written to the MMX register. When the 
result is too large or too small to be represented in a doubleword, the result wraps around and 
the lower 32 bits are written to the destination register. 



Example 


PSUBW mm, mm/m64 


mm 


mm/m64 


mm 




1000000000000000 

0111111100111000 

— 

— 

— 

— 


10000000000000001 

1110100011111001 







0111111111111111 

1001011000111111 


3006028 


Flags Affected 

None. 


Protected Mode Exceptions 

#GP(0) for an illegal memory operand effective address in the CS, DS, ES, FS or GS 
segments; #SS(0) for an illegal address in the SS segment; #PF(fault-code) for a page fault; 
#AC for unaligned memory reference if the current privilege level is 3; #UD if CRO.EM = 1; 
#NM if TS bit in CRO is set; #MF if there is a pending FPU exception. 
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Real Address Mode Exceptions 

Interrupt 13 if any part of the operand lies outside of the effective address space from 0 to 
OFFFFH; #UD if CRO.EM = 1; #NM if TS bit in CRO is set; #MF if there is a pending FPU 
exception. 



Virtual 8086 Mode Exceptions 

Same exceptions as in Real Address Mode; #PF(fault-code) for a page fault; #AC for 
unaligned memory. 
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PSUBSB/PSUBSW—Packed Subtract with Saturation 


Opcode 

Instruction 

Description 

OF E8 /r 

PSUBSB mm, mm/m64 

Subtract signed packed byte in MMX reg/memory from signed packed 
byte in MMX register and saturate. 

OF E9 /r 

PSUBSW mm, mm/m64 

Subtract signed packed word in MMX reg/memory from signed 
packed word in MMX register and saturate. 


Operation 

IF instruction is PSUBSB 
THEN{ 

mm(7..0) 4- SaturateToSignedByte (mm{7..0) - nnm/m64 (7..0)); 
mm(15..8) 4- SaturateToSignedByte (mm(15..8) - mm/m64(15..8)); 
mnn(23..16) <- SaturateToSignedByte (mm(23..16) - mm/m64(23..16)); 
mm(31 ..24) 4- SaturateToSignedByte (mm(31 ..24) - mnn/m64(31..24)); 
mm(39..32) 4- SaturateToSignedByte (mm(39..32) - mm/nn64(39..32)); 
mm(47..40) 4- SaturateToSignedByte (mm(47..40) - mnn/m64(47..40)); 
mm(55..48) 4- SaturateToSignedByte (mm(55..48) - mm/m64(55..48)); 
mm(63..56) 4- SaturateToSignedByte (mm(63..56) - mm/m64(63..56)) 

} 

ELSE {(* instruction is PSUBSW *) 
mm(15..0) 4- SaturateToSignedWord (mm(15..0) - mm/m64(15..0)); 
mm(31..16) 4- SaturateToSignedWord (mm(31..16) - mm/m64(31..16)): 
mm(47..32) 4- SaturateToSignedWord (mm{47..32) - mm/m64(47..32)); 
mm(63..48) 4- SaturateToSignedWord (mm(63..48) - mm/m64(63..48)); 

} 

Description 

The PSUBS instructions subtract the data elements of the source operand from the data 
elements of the destination operand. The results are saturated to the limits of a signed data 
element and written to the destination operand. 

The destination operand is an MMX register. The source operand can either be an MMX 
register or a 64-bit memory operand. 

The PSUBB instruction subtracts the signed bytes of the source operand from the signed 
bytes of the destination operand, and writes the results to the destination register. If the result 
is larger or smaller than the range of a signed byte, the value is saturated; in the case of an 
overflow - to 0x7F, and in the case of an underflow - to 0x80. 

The PSUBW instruction subtracts the signed words of the source operand from the signed 
words of the destination operand and writes the results to the destination register. If the result 
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is larger or smaller than the range of a signed word, the value is saturated; in the case of an 
overflow to 0x7FFF, and in the case of an underflow to 0x8000. 

Example 


PSUBSW mm, mm/m64 


mm 


mm/m64 


mm 




1000000000000000 

0111111100111000 

— 

— 

— 

— 



0000000000000001 

1110100011111001 







1000000000000000 

0111111111111111 


3006029 


Flags Affected 

None. 

Protected Mode Exceptions 

#GP(0) for an illegal memory operand effective address in the CS, DS, ES, FS or GS 
segments; #SS(0) for an illegal address in the SS segment; #PF(fault-code) for a page fault; 
#AC for unaligned memory reference if the current privilege level is 3; #UD if CRO.EM = 1; 
#NM if TS bit in CRO is set; #MF if there is a pending FPU exception. 

Real Address Mode Exceptions 

Interrupt 13 if any part of the operand lies outside of the effective address space from 0 to 
OFFFFH; #UD if CRO.EM = 1; #NM if TS bit in CRO is set; #MF if there is a pending FPU 
exception. 

Virtual 8086 Mode Exceptions 

Same exceptions as in Real Address Mode; #PF(fault-code) for a page fault; #AC for 
unaligned memory reference. 
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PSUBUSB/PSUBSW—Packed Subtract Unsigned with 
Saturation 


Opcode 

Instruction 

Description 

OF D8 /r 

PSUBUSB mm, mm/m64 

Subtract unsigned packed byte in MMX reg/memory from unsigned 
packed byte in MMX register and saturate. 

OF D9 /r 

PSUBUSW mm, mm/m64 

Subtract unsigned packed word in MMX reg/memory from 
unsigned packed word in MMX register and saturate. 


Operation 

IF instruction is PSUBUSB 
THEN{ 

mm(7..0) <- SaturateToUnsignedByte (mm(7.,0) - mm/m64 (7..0)); 
mm(15..8) <- SaturateToUnsignedByte ( mm(15..8) - mm/m64(15..8)); 
mm(23..16) <- SaturateToUnsignedByte (mm(23..16) - mm/m64(23..16)); 
mm(31..24) <- SaturateToUnsignedByte (mm{31..24) - mm/m64(31 ..24)); 
mm(39..32) SaturateToUnsignedByte (mm(39..32) - mm/m64(39..32)); 
mm(47..40) <- SaturateToUnsignedByte (mm(47..40) - mm/m64(47..40)); 
mm(55..48) <- SaturateToUnsignedByte (mm(55..48) - mm/m64(55..48)); 
nnm(63..56) <- SaturateToUnsignedByte (mm(63..56) - mm/m64(63..56)); 

} 

ELSE {(* instruction is PSUBUSW *) 
mm(15..0) <- SaturateToUnsignedWord (mm(15..0) - mm/m64(15..0)); 
mm(31..16) SaturateToUnsignedWord {mm(31..16j - mm/m64(31..16)); 
mnn(47..32) <- SaturateToUnsignedWord (mm(47..32\ - mm/m64(47..32)); 
mm(63..48) <- SaturateToUnsignedWord {mm(63..48) - mm/m64(63..48) ); 

} 

Description 

The PSUBUS instructions subtract the data elements of the source operand from the data 
elements of the destination register. The results are saturated to the limits of an unsigned data 
element and written to the destination operand. 

The destination operand is an MMX register. The source operand can either be an MMX 
register or a 64-bit memory operand. 

The PSUBUSB instruction subtracts the bytes of the source operand from the bytes of the 
destination operand and writes the results to the destination register. If the result element is 
less than zero (a negative value), it is saturated to 0x00. 

The PSUBUSW instruction subtracts the words of the source operand from the words of the 
destination operand and writes the results to the destination register. If the result element is 
less than zero (a negative value), it is saturated to 0x0000. 
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Example 


PSUBUSB mm, mm/m64 


mm 





10000000 0111111111111000 

— 

— 

— 

— 

— 

_ _ _ 

mm/m64 





11111111 00010111 00000111 






1 i i 

mm 





OOOOOOOOl 01101000 11110001 


Flags Affected 

None. 

Protected Mode Exceptions 

#GP(0) for an illegal memory operand effective address in the CS, DS, ES, FS or GS 
segments; #SS(0) for an illegal address in the SS segment; #PF(fault-code) for a page fault; 
#AC for unaligned memory reference if the current privilege level is 3; #UD if CRO.EM = 1; 
#NM if TS bit in CRO is set; #MF if there is a pending FPU exception. 

Real Address Mode Exceptions 

Interrupt 13 if any part of the operand lies outside of the effective address space from 0 to 
OFFFFH; #UD if CRO.EM = 1; #NM if TS bit in CRO is set; #MF if there is a pending FPU 
exception. 

Virtual 8086 Mode Exceptions 

Same exceptions as in Real Address Mode; #PF(fault-code) for a page fault; #AC for 
unaligned memory reference. 
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PUNPCKHBW/PUNPCKHWD/PUNPCKHDQ—Unpack High 
Packed Data 


Opcode 

Instruction 

Description 

OF 68/r 

PUNPCKHBW mm, mm/m64 

Interleave bytes from the high halves of MMX register and MMX 
reg/memory into MMX register. 

OF 69 /r 

PUNPCKHWD mm, mm/m64 

Interleave words from the high halves of MMX register and MMX 
reg/memory into MMX register. 

OF 6A/r 

PUNPCKHDQ mm, mm/m64 

Interleave dwords from the high halves of MMX register and MMX 
req/memory into MMX register. 


Operation 

IF instruction is PUNPCKHBW 
THEN{ 

mm(63..56) <- mm/m64(63..56); 
nnm(55..48) 4 - mm(63..56); 
mm(47..40) <- mm/m64(55..48); 
mm(39..32) 4- mm(55..48); 
mm(31 ..24) 4- mm/m64(47..40): 
mm(23..16)4-mm(47..40): 
mm(15..8) 4- mm/m64(39..32); 
mm(7..0) 4- mm(39..32); 

ELSE IF instruction is PUNPCKHW 
THEN{ 

mm(63..48) 4- mm/m64(63..48); 
mm(47..32) 4- mm(63..48): 
mm(31..16) 4- mm/m64(47..32): 
mm(15..0) 4- mm(47..32); 

} 

ELSE {(* instruction is PUNPCKHDQ *) 
mm(63..32) 4- mm/m64(63..32); 
mm(31..0) 4- mm(63..32) 

} 


Description 

The PUNPCKH instructions unpack and interleave the high-order data elements of the 
destination and source operands into the destination operand. The low-order data elements 
are ignored. 

The destination operand is an MMX register. The source operand can either be an MMX 
register or a 64-bit memory operand. 
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When unpacking from a memory operand, the full 64-bit operand is accessed from memory. 
The instruction uses only the high-order 32 bits. 

The PUNPCKHBW instruction interleaves the four high-order bytes of the source operand 
and the four high-order bytes of the destination operand and writes them to the MMX 
register. 

The PUNPCKHWD instruction interleaves the two high-order words of the source operand 
and the two high-order words of the destination operand and writes them to the MMX 
register. 

The PUNPCKHDQ instruction interleaves the high-order 32 bits of the doubleword of the 
source operand and the high-order 32-bits of the doubleword of the destination operand and 
writes them to the MMX register. 

Note 

If the source operand is all zeros, the result is a zero extension of the high order elements of 
the destination operand. When using the PUNPCKHBW instruction the bytes are zero 
extended, or unpacked into unsigned words. When using the PUNPCKHWD instruction, the 
words are zero exended, or unpacked into unsigned doublewords. 


Example 



Flags Affected 

None. 



5-61 
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Protected Mode Exceptions 

#GP(0) for an illegal memory operand effective address in the CS, DS, ES, FS or GS 
segments; #SS(0) for an illegal address in the SS segment; #PF(fault-code) for a page fault; 
#AC for unaligned memory reference if the current privilege level is 3; #UD if CRO.EM = 1; 
#NM if TS bit in CRO is set; #MF if there is a pending FPU exception. 

Real Address Mode Exceptions 

Interrupt 13 if any part of the operand lies outside of the effective address space from 0 to 
OFFFFH; #UD if CRO.EM = 1; #NM if TS bit in CRO is set; #MF if there is a pending FPU 
exception. 

Virtual 8086 Mode Exceptions 

Same exceptions as in Real Address Mode; #PF(fault-code) for a page fault; #AC for 
unaligned memory reference. 
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PUNPCKLBW/PUNPCKLWD/PUNPCKLDQ—Unpack Low 
Packed Data 


Opcode 

Instruction 

Description 

OF 60/r 

PUNPCKLBW mm, mm/m32 

Interleave bytes from the low halves of MMX register and MMX 
reg/memory into MMX register. 

OF 61 /r 

PUNPCKLWD mm, mm/m32 

Interleave words from the low halves of MMX register and MMX 
reg/memory into MMX register. 

OF 62 /r 

PUNPCKLDQ mm, mm/m32 

Interleave dwords from the low halves of MMX register and MMX 
req/memorv into MMX reqister. 


Operation 

IF instruction is PUNPCKLBW 
THEN{ 

mm(63..56) <-mm/m32(31..24); 
mm(55..48) <- mm(31..24); 
mm(47..40) <- mm/m32(23..16); 
mm(39..32) <- mm(23..16): 
mm(31 ..24) <r- mm/m32(15..8); 
mm(23..16) <-mm(15..8); 
mm(15..8) <- mm/m32(7..0); 
mm(7..0) mm(7..0); 

} 

ELSE IF instruction is PUNPCKLWD 
THEN{ 

mm(63..48) <- mm/m32(31..16): 
mm(47..32) <- mm(31 ..16); 
mm(31 ..16) <- mm/m32(15..0); 
mm(15..0) «- mm(15..0); 

} 

ELSE{ (* instruction is PUNPCKLDQ *) 
mm(63..32) ^ mm/m32(31..0): 
mm(31..0) <- mm(31..0): 

} 


Description 

The PUNPCKL instructions unpack and interleave the low-order data elements of the 
destination and source operands into the destination operand. 

The destination operand is an MMX register. The source operand can either be an MMX 
register or a 32-bit memory operand. When the source data comes from 64-bit registers, the 
upper 32 bits are ignored. 
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When unpacking from a memory operand, only 32 bits are accessed. The instruction uses all 
32 bits. 

The PUNPCKLBW instruction interleaves the four low-order bytes of the source operand and 
the four low-order bytes of the destination operand and writes them to the MMX register. 

The PUNPCKLWD instruction interleaves the two low-order words of the source operand 
and the two low-order words of the destination operand and writes them to the MMX 
register. 

The PUNPCKLDQ instruction interleaves the low-order doubleword of the source operand 
and the low-order doubleword of the destination operand and writes them to the MMX 
register. 

Note 

If the source operand has a value of all zeros, the result is a zero extension of the low order 
elements of the destination operand. When using the PUNPCKLBW instruction the bytes are 
zero extended, or unpacked into unsigned words. When using the PUNPCKLWD instruction, 
the words are zero extended, or unpacked into unsigned doublewords. 


Example 



Flags Affected 

None. 


5-64 
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Protected Mode Exceptions 

#GP(0) for an illegal memory operand effective address in the CS, DS, ES, FS or GS 
segments; #SS(0) for an illegal address in the SS segment; #PF(fault-code) for a page fault; 
#AC for unaligned memory reference if the current privilege level is 3; #UD if CRO.EM = 1; 
#NM if TS bit in CRO is set; #MF if there is a pending FPU exception. 

Real Address Mode Exceptions 

Interrupt 13 if any part of the operand lies outside of the effective address space from 0 to 
OFFFFH; #UD if CRO.EM = 1; #NM if TS bit in CRO is set; #MF if there is a pending FPU 
exception. 



Virtual 8086 Mode Exceptions 

Same exceptions as in Real Address Mode; #PF(fault-code) for a page fault; #AC for 
unaligned memory reference. 
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PXOR —Bitwise Logical Exclusive OR 


Opcode 

Instruction 

Description 

OF EF/r 

PXOR mm, mm/m64 

XOR 64 bits from MMX req/memory to MMX reqister. 


Operation 

mm <-mm XOR mm/m64; 

Description 

The PXOR instruction performs a bitwise logical XOR on the 64 bits of the destination with 
the source operands and writes the result to destination register. 

Each bit of the result is 1 if the corresponding bits of the two operands are different. Each bit 
is 0 if the corresponding bits of the operands are the same. 

The source operand can either be an MMX register or a 64 bit memory operand. 


Example 



PXOR mm, mm/m64 


mm 

1111111111111000000000000000010110110101100010000111011101110111 



A 


mm/m64 

0001000011011001010100000011000100011110111011110001010110010101 



1 


mm 

1110111100100001010100000011010010101011011001110110001011100010 




»06033 


Flags Affected 

None. 
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Protected Mode Exceptions 

#GP(0) for an illegal memory operand effective address in the CS, DS, ES, FS or GS 
segments; #SS(0) for an illegal address in the SS segment; #PF(fault-code) for a page fault; 
#AC for unaligned memory reference if the current privilege level is 3; #UD if CRO.EM = 1; 
#NM if TS bit in CRO is set; #MF if there is a pending FPU exception. 

Real Address Mode Exceptions 

Interrupt 13 if any part of the operand lies outside of the effective address space from 0 to 
OFFFFH; #UD if CRO.EM = 1; #NM if TS bit in CRO is set; #MF if there is a pending FPU 
exception. 

Virtual 8086 Mode Exceptions 

Same exceptions as in Real Address Mode; #PF(fault-code) for a page fault; #AC for 
unaligned memory reference. 
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APPENDIX A 

lA MMX™ INSTRUCTION SET SUMMARY 


Table A-1 summarizes the lA MMX™ instruction set base mnemonics. The instructions are 
grouped by categories of related functions. 

Most of the instructions have multiple variations that are not listed in Table A-1. For 
example, PADD has the following variations: PADDB, PADDW, and PADDD. The 
instruction variations and mnemonics are detailed in the Instruction description section of 
Chapter 5. 
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Table A-1. lA MMX Instruction Set Summary, Grouped into Functional Categories 


CATEGORY 

Arithmetic 

addition 

subtraction 

multiplication 

multiply and add 

Wraparound 

d Hi 


PADD 

PSUB 

PMULLVH 

PMADD 

PADDS 

PSUBS 

PADDUS 

PSUBUS 

Comparison compare 

PCMPEQ 





PCMPGT 




Conversion pack 


PACKSS 

PACKUS 

unpack 

PUNPCKL/H 




Logical 

Packed 

Full 64-bit 

and 



PAND 


and not 



PANDN 

or 



POR 


exclusive or 



PXOR 


Shift shift left logical 

PSLL 


PSLL 


shift right logical 

PSRL 


PSRL 


shift right arithmetic 

PSRA 




Data Transfer Operations 

32-bit transfers 

64-blt transfers 

registers-register 





load from memory 

MOVD 




store to memory 

MOVD 


MOVQ 



FP and WIMX^m State Management 


EMMS 
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APPENDIX B 

lA MMX™ INSTRUCTION FORMATS AND 

ENCODINGS 


B.1. INSTRUCTION FORMATS 

All MMX instructions, except the EMMS instruction, use the same format similar as the two- 
byte Intel Architecture integer operations. Details of subfield encodings within these formats 


are presented below. 

_Table B-1. Encoding of Granularity of Data (gg) Field 


gg 

Granularity of Data 

00 

packed bytes 

01 

packed words 

10 

packed doublewords 

11 

quadword 
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Table B-2. Encoding of 32-bit General Purpose (reg) Field for Register-to-Register 
_Operations__ 


reg Field 

Register Selected 

000 

EAX 

001 

ECX 

010 

EDX 

oil 

EBX 

100 

ESP 

101 

EBP 

110 

ESI 

111 

EDI 


NOTE; For register-to-register operations, the decoding of integer registers is independent of processor 
mode. For register-to-memory operations, the effective address is calculated based on the processor mode in 
effect. 


Table B-3. Encoding of 64-bit MMX™ Register (mmxreg) Field 


mmxreg Field 

MMX Register Selected 

000 

mmO 

001 

mml 

010 

mm2 

oil 

mm3 

100 

mm4 

101 

mmS 

110 

mm6 

111 

_ 

mm7 


For more details, see Table 25-2, Table 25-3, and Appendix F of the Pentium® Processor 
Family Developer’s Manual. 
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B.2. INSTRUCTION ENCODINGS AND DATATYPE CROSS- 
REFERENCE 

For each MMX instruction, Table B-4 lists instruction encodings and the datatypes 
supported—byte (B), word (W), doubleword (DW), and quadword (QW). 


o 

= output 

I 

= input 

s 

= signed saturation 

u 

= unsigned saturation 

n/a 

= not applicable 


Figure B-1. Key to Codes for Datatype Cross-Reference 



Table B-4. lA MMX™ Instruction Formats and Encodings 


Instruction 

Format 

B 

W 

DW 

QW 

EMMS - Empty 
MMX state 

0000 1111:01110111 

n/a 

n/a 

n/a 

n/a 

MOVD - Move 
doubleword 

reg to mmxreg 

reg from mmxreg 

mem to mmxreg 

mem from 
mmxreg 

0000 1111:01101110:11 mmxreg reg 

0000 1111:01111110:11 mmxreg reg 

0000 1111:01101110: mod mmxreg r/m 

0000 1111:01111110: mod mmxreg r/m 

N 

N 

Y 

N 

MOVQ - Move 
quadword 

mmxreg2 to 
mmxreg 1 

mmxreg2 from 
mmxregl 

mem to mmxreg 

mem from 
mmxreg 

0000 1111:01101111:11 mmxregl mmxreg2 

0000 1111:01111111:11 mmxregl mmxreg2 

0000 1111:01101111: mod mmxreg r/m 

0000 1111:01111111: mod mmxreg r/m 

N 

N 

N 



I 
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Table B-4. lA MMX™ Instruction Formats and Encodings (Contd.) 


Instruction Format 

B 

w 

DW 

QW 

PACKSSDW1 - 
Pack dword to 
word data 
(signed with 
saturation) 

mmxreg2 to 0000 1111:01101011:11 mmxregl mmxreg2 

mmxregl 

memory to 0000 1111:01101011: mod mmxreg r/m 

mmxreg 

n/a 

0 


n/a 

PACKSSWB1 - 
Pack word to 
byte data (signed 
with saturation) 

mmxreg2 to 0000 1111:01100011:11 mmxregl mmxreg2 

mmxregl 

memory to 0000 1111:01100011: mod mmxreg r/m 

mmxreg 

O 

1 

n/a 

n/a 

PACKUSWB1 - 
Pack word to 
byte data 
(unsigned with 
saturation) 

mmxreg2 to 0000 1111:01100111:11 mmxregl mmxreg2 

mmxregl 

memory to 0000 1111:01100111: mod mmxreg r/m 

mmxreg 

O 

1 

n/a 

n/a 

PADD - Add with 
wrap-around 

mmxreg2 to 0000 1111:111111 gg: 11 mmxregl mmxreg2 

mmxregl 

memory to 0000 1111:111111 gg: mod mmxreg r/m 

mmxreg 

1 

1 

1 

N 


B-4 
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Table B-4. lA MMX™ Instruction Formats and Encodings (Contd.) 


Instruction Format 

B 

W 

DW 

QW 

PADDS- Add 
signed with 
saturation 

mmxreg2 to 0000 1111; 111011 gg: 11 mmxregl mmxreg2 

mmxregl 

memory to reg 0000 1111:111011 gg; mod reg r/m 

Y 

Y 

N 

N 

PADDUS- Add 
unsigned with 
saturation 

mmxreg2 to 0000 1111:110111 gg: 11 mmxregl mmxreg2 

mmxregl 

memory to 0000 1111:110111 gg: mod mmxreg r/m 

mmxreg 

Y 

Y 

N 

N 

PAND - Bitwise 

And 

mmxreg2 to 0000 1111:11011011:11 mmxregl mmxreg2 

mmxregl 

memory to 0000 1111:11011011: mod mmxreg r/m 

mmxreg 

N 

N 

N 

Y 

PANDN - Bitwise 

AndNot 

mmxreg2 to 0000 1111:11011111:11 mmxregl mmxreg2 

mmxregl 

memory to 0000 1111:11011111: mod mmxreg r/m 

mmxreg 

N 

N 

N 

1 

PCMPEQ- 
Packed compare 
for equaiity 

mmxreg2 with 0000 1111:011101 gg: 11 mmxregl mmxreg2 

mmxregl 

memory with 0000 1111:011101 gg: mod mmxreg r/m 

mmxreg 

Y 

Y 

1 

N 


I 
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Table B-4. lA MMX™ Instruction Formats and Encodings (Contd.) 


Instruction Format 

B 

W 

DW 

QW 

PCMPGT- 
Packed compare 
greater (signed) 

mmxreg2 with 0000 1111:011001 gg: 11 mmxregl mmxreg2 

mmxregl 

memory with 0000 1111:011001 gg: mod mmxreg r/m 

mmxreg 

1 

1 

1 

N 

PMADD - Packed 
muitipiy add 

mmxreg2 to 0000 1111:11110101:11 mmxregl mmxreg2 

mmxregl 

memory to 0000 1111:11110101: mod mmxreg r/m 

mmxreg 

n/a 

1 

0 

n/a 

PMULH - Packed 
muitiplication 

mmxreg2 to 0000 1111:11100101:11 mmxregl mmxreg2 

mmxregl 

memory to 0000 1111:11100101: mod mmxreg r/m 

mmxreg 

N 

1 

N 

N 

PMULL - Packed 
muitipiication 

mmxreg2 to 0000 1111:11010101:11 mmxregl mmxreg2 

mmxregl 

memory to 0000 1111:11010101: mod mmxreg r/m 

mmxreg 

N 

Y 

N 

N 

POR - Bitwise Or 

mmxreg2 to 0000 1111:11101011:11 mmxregl mmxreg2 

mmxregl 

memory to 0000 1111:11101011: mod mmxreg r/m 

mmxreg 

N 

N 

N 

1 


B-6 
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Table B-4. lA MMX™ Instruction Formats and Encodings (Contd.) 


Instruction 

Format 

PSLL2 - Packed 
shift left logical 


mmxreg2 by 
mmxregl 

0000 1111:1 111OOgg: 11 mmxregl mmxreg2 

mmxreg by 
memory 

0000 1111:1111 OOgg: 11 mmxreg r/m 

mmxreg by 
immediate 

0000 1111:0111 OOgg: 11110 mmxreg: immS data 

PSRA2 - Packed 
shift right 
arithmetic 


mmxreg2 by 
mmxregl 

0000 1111:111 OOOgg; 11 mmxregl mmxreg2 

mmxreg by 
memory 

0000 1111:111 OOOgg: 11 mmxreg r/m 

mmxreg by 
immediate 

0000 1111:0111 OOgg: 11 100 mmxreg: imm8 data 

PSRL2-Packed 
shift right logical 


mmxreg2 by 
mmxregl 

0000 111 1:1101 OOgg: 11 mmxregl mmxreg2 

mmxreg by 
memory 

0000 1111:1101 OOgg: 11 mmxreg r/m 

mmxreg by 
immediate 

0000 1111:01110Ogg: 11010 mmxreg: imm8 data 

PSUB - Subtract 
with wrap¬ 
around 


mmxreg2 to 
mmxregl 

0000 1111:1111 lOgg: 11 mmxregl mmxreg2 

memory to 
mmxreg 

0000 1111:1111 lOgg: mod mmxreg r/m 



B-7 
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Table B-4. lA MMX™ Instruction Formats and Encodings (Contd.) 


Instruction Format 

B 

W 

DW 

QW 

PSUBS- 
Subtract signed 
with saturation 

mmxreg2 to 0000 1111:11101 Ogg: 11 mmxregl mmxreg2 

mmxregl 

memory to 0000 1111:11101 Ogg: mod mmxreg r/m 

mmxreg 

Y 

Y 

N 

N 

PSUBUS- 
Subtract 
unsigned with 
saturation 

mmxreg2to 0000 1111:11011 Ogg: 11 mmxregl mmxreg2 

mmxregl 

memory to 0000 1111:11011 Ogg: mod mmxreg r/m 

mmxreg 

Y 

Y 

N 

N 

PUNPCKH - 
Unpack high 
data to next 
iarger type 

mmxreg2 to 0000 1111:01101 Ogg: 11 mmxregl mmxreg2 

mmxregl 

memory to 0000 1111:01101 Ogg: mod mmxreg r/m 

mmxreg 

1 

1 

1 

N 

PUNPCKL- 
Unpack tow data 
to next iarger 
type 

mmxreg2 to 0000 1111:0110OOgg: 11 mmxregl mmxreg2 

mmxregl 

memory to 0000 1111:0110OOgg: mod mmxreg r/m 

mmxreg 

1 

1 

1 

N 
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Table B-4. lA MMX™ Instruction Formats and Encodings (Contd.) 


Instruction 

Format 

B 

W 

DW 

QW 

PXOR - Bitwise 
Xor 

mmxreg2 to 
mmxregl 

memory to 
mmxreg 

0000 1111:11101111:11 mmxregl mmxreg2 

0000 1111:11101111: mod mmxreg r/m 

N 

N 

N 

Y 


NOTE: 


1. The PACK instructions perform saturation from signed packed data of one type to signed or unsigned data 
of the next smaller type. 

2. The format of shift instructions has one additional format to support shifting by immediate shift-counts. 

The shift operations are not supported equally for all data types. 
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APPENDIX C 
ALPHABETICAL LIST OF lA MMX™ 
INSTRUCTION SET MNEMONICS 


The following table lists the mnemonics of the lA MMX™ instructions in alphabetical order. 
For each mnemonic, it summarizes the type of source data, the encoding of the first and 
second bytes in hexadecimal, and the format used. 


Table C-1. lA MMX™ Instruction Set Mnemonics 


MNEMONIC 

OPERAND TYPES 

Byte 1 

Byte 2 

Byte 3, [4] 

EMMS 

None 

OF 

77 

mod-rm, [sib] 

MOVD 

register, memory/iregister 

OF 

6E 

mod-rm, [sib] 

MOVD 

memory/iregister, register 

OF 

7E 

mod-rm, [sib] 

MOVQ 

register, memory/register 

OF 

6F 

mod-rm, [sib] 

MOVQ 

memory/register, register 

OF 

7F 

mod-rm, [sib] 

PACKSSDW 

register, memory/register 

OF 

6B 

mod-rm, [sib] 

PACKSSWB 

register, memory/register 

OF 

63 

mod-rm, [sib] 

PACKUSWB 

register, memory/register 

OF 

67 

mod-rm, [sib] 

PADDB 

register, memory/register 

OF 

FC 

mod-rm, [sib] 

PADDD 

register, memory/register 

OF 

FE 

mod-rm, [sib] 

PADDSB 

register, memory/register 

OF 

EC 

mod-rm, [sib] 

PADDSW 

register, memory/register 

OF 

ED 

mod-rm, [sib] 

PADDUSB 

register, memory/register 

OF 

DC 

mod-rm, [sib] 

PADDUSW 

register, memory/register 

OF 

DD 

mod-rm, [sib] 

PADDW 

register, memory/register 

OF 

FD 

mod-rm, [sib] 

PAND 

register, memory/register 

OF 

DB 

mod-rm, [sib] 

PANDN 

register, memory/register 

OF 

DF 

mod-rm, [sib] 

PCMPEQB 

register, memory/register 

OF 

74 

mod-rm, [sib] 

PCMPEQD 

register, memory/register 

OF 

76 

mod-rm, [sib] 



C-1 







































































































Table C-1. lA MMX^“ Instruction Set Mnemonics (Contd.) 


MNEMONIC 

OPERAND TYPES 

Byte 1 

Byte 2 

Byte 3, [4] 

PCMPEQW 

register, memory/register 

OF 

75 

mod-rm, [sib] 

PCMPGTB 

register, memory/register 

OF 

64 

mod-rm, [sib] 

PCMPGTD 

register, memory/register 

OF 

66 

mod-rm, [sib] 

PCMPGTW 

register, memory/register 

OF 

65 

mod-rm, [sib] 

PMADDWD 

register, memory/register 

OF 

F5 

mod-rm, [sib] 

PMULHW 

register, memory/register 

OF 

E5 

mod-rm, [sib] 

PMULLW 

register, memory/register 

OF 

D5 

mod-rm, [sib] 

POR 

register, memory/register 

OF 

EB 

mod-rm, [sib] 

PSHIMD* 

register, immediate 

OF 

72 

mod-rm, imm 

PSHIMQ* 

register, immediate 

OF 

73 

mod-rm, imm 

PSHIMW* 

register, immediate 

OF 

71 

mod-rm, imm 

PSLLD 

register, memory/register 

OF 

F2 

mod-rm, [sib] 

PSLLQ 

register, memory/register 

OF 

F3 

mod-rm, [sib] 

PSLLW 

register, memory/register 

OF 

FI 

mod-rm, [sib] 

PSRAD 

register, memory/register 

OF 

E2 

mod-rm, [sib] 

PSRAW 

register, memory/register 

OF 

El 

mod-rm, [sib] 

PSRLD 

register, memory/register 

OF 

D2 

mod-rm, [sib] 

PSRLQ 

register, memory/register 

OF 

D3 

mod-rm, [sib] 

PSRLW 

register, memory/register 

OF 

D1 

mod-rm, [sib] 

PSUBB 

register, memory/register 

OF 

F8 

mod-rm, [sib] 












































































































Table C-1. lA lyiMX^*^ Instruction Set Mnemonics (Contd.) 


MNEMONIC 

OPERAND TYPES 

Byte 1 

Byte 2 

Byte 3, [4] 

PSUBD 

register, memory/register 

OF 

FA 

mod-rm, [sib] 

PSUBSB 

register, memory/register 

OF 

E8 

mod-rm, [sib] 

PSUBSW 

register, memory/register 

OF 

E9 

mod-rm, [sib] 

PSUBUSB 

register, memory/register 

OF 

D8 

mod-rm, [sib] 

PSUBUSW 

register, memory/register 

OF 

D9 

mod-rm, [sib] 

PSUBW 

register, memory/register 

OF 

F9 

mod-rm, [sib] 

PUNPCKHBW 

register, memory/register 

OF 

68 

mod-rm, [sib] 

PUNPCKHDQ 

register, memory/register 

OF 

6A 

mod-rm, [sib] 

PUNPCKHWD 

register, memory/register 

OF 

69 

mod-rm, [sib] 

PUNPCKLBW 

register, memory/register 

OF 

60 

mod-rm, [sib] 

PUNPCKLDQ 

register, memory/register 

OF 

62 

mod-rm, [sib] 

PUNPCKLWD 

register, memory/register 

OF 

61 

mod-rm, [sib] 

PXOR 

register, memory/register 

OF 

EF 

mod-rm, [sib] 


Notes: 

* These are not the actual mnemonics: 

PSHIMD represents the PSLLD, PSRAD and PSRLD instructions when shifting by immediate shift counts. 
PSHIMW represents the PSLLW, PSRAW and PSRLW instructions when shifting by immediate shift counts. 
PSHIMQ represents the PSLLQ and PSRLQ instructions when shifting by immediate shift counts. 

The instructions that shift by immediate counts are differentiated by the ModR/M bytes (See Appendix B). 
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APPENDIX D 

lA MMX™ INSTRUCTION SET OPCODE MAP 


The detailed encodings of the Intel Architecture MMX™ instructions are listed in the shaded 
boxes of the Opcode Map tables below. All MMX instructions, except the EMMS 
instruction, use the same format as the two-byte Intel Architecture integer operations. 

All blanks in the Opcode Map are reserved and should not be used. Do not depend on 
the operation of unspecifled opcodes. OFOBh or 0FB9h should be used when deliberately 
generating an illegal opcode exception. 


Key to Abbreviations 

operands are identified by a two-character code of the form Zz. The first character, an 
uppercase letter, specifies the addressing method; the second character, a lowercase letter, 
specifies the type of operand. For opcodes with two operands, the left code refers to the 
destination operand and the right code refers to the source operand. All MMX instructions, 
except the EMMS instruction, reference and operate on two operands. 


Codes for Addressing Method 

C The reg field of the ModR/M byte selects a control register; e.g., MOV (0F20, 0F22). 

D The reg field of the ModR/M byte selects a debug register; e.g., MOV (0F21, 0F23). 

E A ModR/M byte follows the opcode and specifies the operand. The operand is either a 
general register or a memory address. If it is a memory address, the address is computed 
from a segment register and any of the following values: a base register, an index 
register, a scaling factor, a displacement. 

G The reg field of the ModR/M byte selects a general register; e.g., AX(OOO). 

I Immediate data. The value of the operand is encoded in subsequent bytes of the 
instruction. 

M The ModR/M byte may refer only to memory; e.g., LSS, LFS, LGS, CMPXCHG8B. 

P The reg field of the ModR/M byte selects a packed quadword MMX register. 

Q A ModR/M byte follows the opcode and specifies the operand. The operand is either an 
MMX register or a memory address. If it is a memory address, the address is computed 



lA MMX™ INSTRUCTION SET OPCODE MAP 

from a segment register and any of the following values: a base register, an index 
register, a scaling factor, a displacement. 

R The mod field of the ModR/M byte may refer only to a general register; e.g., MOV 
(0F20-0F24, 0F26). 

Codes for Operand Type 

b Byte (regardless of operand size attribute), 
d Doubleword (regardless of operand size attribute), 
p 32-bit or 48-bit pointer, depending on operand size attribute, 
q Quadword (regardless of operand size attribute), 
s Six-byte pseudo-descriptor. 

V Word or doubleword, depending on operand size attribute, 
w Word (regardless of operand size attribute). 



Register Codes 

When an operand is a specific register encoded in the opcode, the register is identified by its 
name, for example: AX, CL, or ESI. The name of the register indicates whether the register 
is 32-bits, 16-bits, or 8-bits wide. A register identifier of the form eXX is used when the 
width of the register depends on the oeprand size attribute; for example, eAX indicates that 
the AX register is used when the operand size attribute is 16 and the EAX register is used 
when the operand size attribute is 32. 
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Table D-1. Opcode Map (First Byte is OFH) 



0 

1 

2 

3 

4 

5 

6 

7 

0 

GRP 6 


LAR 

Gv, Ew 

LSL 

Gv. Ew 



CLTS 


1 









2 

MOV 

Rd, Cd 

MOV 

Rd, Dd 

MOV 

Cd. Rd 

MOV 

Dd, Rd 





3 

WRMSR 

RDTSC 

RDMSR 






4 









5 









6 

PUNPCKLBW 

Pq, Od 

PUNPCKLWD 

Pq.Qd 

PUNPCKLDQ 

lilKiillS 

PACKSSWB 

Pq.Qq 

PCMPGTB 

PCMPGTW 

Pq.Qq 

PCWFGTD 

Pq. Qq 

PACKUSWB 

Pq. Qq 

7 


Grp A 

PCMPEQB 

PCMPEQW 

PCV-PEOD 

EMMS 



PSHIMW 

PSHlMD 

PSHIMQ 


Pq. Qq 

Pq. Qq 


8 

Long-displacement jump on condition (Jv) 


JO 

JNO 

JB 

JNB 

JZ 

JNZ 

JBE 

JNBE 

9 

Byte Set on condition (Eb) 


SETO 

SETNO 

SETB 

SETNB 

SETZ 

SETNZ 

SETBE 

SETNBE 

A 

PUSH 

FS 

POP 

FS 

CPUID 

BT 

Ev, Gv 

SHLD 

Ev, Gv. lb 

SHLD 

Ev, Gv, CL 



B 

CMPXCH 

Eb. Gb 

CMPXCH 

Ev, Gv 

|n 

BTR 

Ev, Gv 

■n 

|n 


MOVZX 

Gv, Ew 

C 

XADD 

Eb, Gb 

XADD 

Ev, Gv 






GRP 9 

D 


PSRLW 

Pq. Qq 

PSRLD 

.PqvQg. 



PMULLW 

.Pq-. Oq. 



E 


P3RAW 

Pq, Qp 

PSRAO 

. .pp.Qq... 



PMULHVV 

Pq. Qq 



F 


PSLLW 

Pq, Op 

PSLLD 

_Pq' Qq_ 

PSLLQ 

Pq. Op 


PMADOWD 

Pq. Qq 


















































lA MMX™ INSTRUCTION SET OPCODE MAP 


inl^. 

Table D-1. Opcode Map (First Byte is OFH) (Contd) 



8 

9 

A 

B 

c 

D 

E 

F 

0 

INVD 

WB INVD 


Illegal 

opcode 





1 









2 









3 









4 









5 









6 



PUNPCKHDQ 

PACKSSDW 



MOVD 

MOVQ 



Pq.Oq 


Pq.Oq 



PQ,Ed 


7 







MOVO 

MOVQ 








6d, Pd 


8 

Long-displacement jump on condition (Jv) 


JS 

JNS 

jp 

JNP 

JL 

JNL 

JLE 

JNLE 


Byte Set on condition (jv) 

9 

SETS 

SETNS 

SETP 

SETNP 

SETL 

SETNL 

SETLE 

SETNLE 


Eb 

Eb 

Eb 

Eb 

Eb 

Eb 

Eb 

Eb 

A 

PUSH 

POP 

RSM 


SHRD 

SHRD 


IMUL 


GS 

GS 



Ev, Gv, lb 

Ev, Gv, CL 


Gv, Ev 

B 


Illegal opcode 

GRP 8 


BSF 

BSR 

MOVSX 

MOVSX 




Ev. lb 


Gv, Ev 

Gv, Ev 

Gv, Eb 

Gv, Ew 

C 

BSWAP 

BSWAP 

BSWAP 

BSWAP 

BSWAP 

BSWAP 

BSWAP 

BSWAP 


EAX 

ECX 

EDX 

EBX 

ESP 

EBP 

ESI 

EDI 






















































ADC 

SBB 

AND 

SUB 

XOR 

CMP 

RCL 

RCR 

SHL SAL 

SHR 


SAR 

NOT 

NEG 

MUL 

IMUL 

DIV 

IDIV 



AUeAX 

AL/eAX 

AL/eAX 

AL/eAX 



LMSW 

Ew 

INVLPG 

BTR 

BTC 
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NORTH AMERICAN SALES OFFICES 


ARIZONA 

Intel Corp. 

410 North 44th Street 
Suite 470 
Phoenix 85008 
Tel: (800) 628-8686 
FAX: (602) 244-0446 

CALIFORNIA 
Intel Corp. 

26707 W. Agoura Road 
Suite 203 

Calabasas. CA 91302 
Tel: (800) 628-8686 
FAX: (818)-880-1820 

Intel Corp. 

3550 Watt Avenue 
Suite 140 
Sacramento 95821 
Tel: (800) 628-8686 
FAX: (916) 979-7011 

Intel Corp. 

9655 Granite Ridge Drive 
3rd Floor 
Suite 4A 

San Diego 92123 
Tel: (800) 628-8686 
FAX: (619) 467-2460 

Intel Corp. 

1781 Fox Drive 
San Jose 95131 
Tel: (800) 628-8686 
FAX: (408) 441-9540 

Intel Corp. 

1551 North Tustin Avenue 
Suite 800 
Santa Ana 92701 
Tel: (800) 628-8686 
TWX:(910) 595-1114 
FAX: (714)541-9157 

Intel Corp. 

514 Via de la Valle 
Suite 20S-RCO 
Solana Beach 92075 

Intel Coip. 

1960 E. Grand Avenue 
Suite 150 

El Segundo, CA 90245 
Tel: (800) 628-8686 
FAX: (310) 640-7133 

COLORADO 
Intel Corp. 

600 South Cherry Street 
Suite 700 
Denver 80222 
Tel: (800) 628-8686 
TWX: 910-931-2289 
FAX: (303) 322-8670 

CONNECTICUT 
Intel Corp. 

40 Old Ridgebury Road 
Suite 311 
Danbury 06811 
Tel: 800) 628-8686 
FAX: (203) 778-2168 

FLORIDA 
Intel Corp. 

600 West Hillsboro Blvd. 
Suite 348 

Deerfield Beach 33441 
Tel: (800) 628-8686 
FAX: (305) 421-2444 


Intel Corp. 

2250 Lucien Way 
Suite 100 
Suite 8 

Maitland 32751 
Tel: (800) 628-8686 
FAX: (407) 660-1283 

GEORGIA 

Intel Corp. 

20 Technology Park 
Suite 150 
Norcross 30092 
Tel: (800) 628-8686 
FAX: (404) 448-0875 

IDAHO 

Intel Corp. 

910 W. Main Street 
Suite 236 
Boise 83702 
Tel: (800) 628-8686 
FAX: (208) 331-2295 


ILLINOIS 
Intel Corp. 

300 North Martingale Road 
Suite 400 

Schaumburg 60173 
Tell: (800) 628-8686 
FAX: (708) 605-9762 

INDIANA 

Intel Corp. 

8041 Knue Road 
Indianapolis 46250 
Tel: (800) 628-8686 
FAX: (317) 577-4939 

MARYLAND 
Intel Corp. 

131 National Bus. Pkwy 
Suite 200 

Annapolis Junction 20701 
Tel: (800) 628-8686 
FAX: (301) 206-3678 

MASSACHUSETTS 

Intel Corp. 

Nagog Park 
125Nagog Park 
Acton 01720 
Tel: (800) 628-8686 
FAX: (508) 266-3867 

MICHIGAN 
Intel Corp. 

32255 North Western Hwy. 
Suite 212, Tri Atria 
Farmington Hills 48334 
Tel: (800) 628-8686 
FAX: (313) 851-8770 

MINNESOTA 
Intel Corp. 

3500 West 80th Street 
Suite 360 
Bloomington 55431 
Tel: (800) 628-8686 
TWX: 910-576-2867 
FAX: (612) 831-6497 

NEW JERSEY 

Intel Corp. 

2001 Route 46 
Suite 310 
Parsippany 07054 
Tel: (800) 628-8686 


Intel Corp. 

Lincroft Center 
125 Half Mila Road 
Red Bank 07701 
Tel: (800) 628-8686 
FAX: (908) 747-0983 

NEW YORK 
Intel Corp. 

850 Cross Keys Office Pk 
Fairport 14450 
Tel: (800) 628-8686 
TWX: 510-253-7391 
FAX: (716) 223-2561 

Intel Corp. 

2950 Expressway Drive 
Islandia 11722 
Tel: (800) 628-8686 
TWX: 510-227-6236 
FAX: (516) 348-7939 

OHIO 

Intel Corp. 

56 Milford Drive 
Suite 205 
Hudson 44236 
Tel: (800) 628-8686 
FAX: (216) 528-1026 

•f Intel Corp. 

3401 Park Center Drive 
Suite 220 
Dayton 45414 
Tel: (800) 628-8686 
TWX: 810-450-2528 
FAX: (513) 890-8658 

OKLAHOMA 
Intel Corp. 

6801 North Broadway 
Suite 115 

Oklahoma City 73162 
Tel: (800) 628-8686 
FAX: (405) 840-9819 

OREGON 

Intel Corp. 

15254 NW Greenbrier 
Pk^ 

Building B 
Beaverton 97006 
Tel: (800) 628-8686 
TWX: 910-467-8741 
FAX: (503) 645-8181 

PENNSYLVANIA 

Intel Corp. 

925 Harvest Drive 
Suits ZOO 
Blue Bell 19422 
Tel: (800) 628-8686 
FAX: (215) 641-0785 

SOUTH CAROLINA 

Intel Corp. 

7403 Parklane Road 
Suite 4 

Columbia 29223 
Tel: (800) 628-8686 
FAX: (803) 788-7999 

Intel Corp. 

100 Executive Center Dr 
Suite 109, B183 
Greenville 29615 
Tel: (800) 628-8686 
FAX: (803) 297-3401 


TEXAS 
Intel Corp. 

8911 Capital of Texas Hwy 
Suite 4230 
Austin 78759 
Tel: (800) 628-8686 
FAX: (512) 338-9335 

Intel Corp. 

5000 Quorum Drive 
Suite 750 
Dallas 75240 
Tel: (800) 628-8686 
FAX: (214) 233-1325 

Intel Corp. 

20405 State Hwy 249 
Suite 830 
Houston 77070 
Tel: (800) 628-8686 
TWX: 910-881-2490 
FAX: (713) 376-2891 

UTAH 

Intel Corp. 

428 East 6400 South 
Suite 135 
Murray 84107 
Tel: (800) 628-8686 
FAX: (801) 268-1457 

WASHINGTON 
Intel Corp. 

2800 156lh Avenue SE 
Suite 105 
Bellevue 98007 
Tel: (800) 628-8686 
FAX: (206) 746-4495 

WISCONSIN 
Intel Corp. 

400 North Executive Drive 
Suite 401 
Brookfield 53005 
Tel: (800) 628-8686 
FAX: (414) 789-2746 


CANADA 

BRITISH COLUMBIA 

Intel of Canada, Ltd. 

999 Canada Place 
Suite 404 
Suite 11 

Vancouver V6C 3E2 
Tel: (800) 628-8686 
FAX: (604) 844-2813 

ONTARIO 

Intel of Canada, Ltd. 
2650 Queensview Drive 
Suite 250 
Ottawa K2B 8H6 
Tel: (800) 628-8686 
FAX: (613) 820-5936 

Intel of Canada, Ltd. 

190 Attwell Drive 
Suite 500 
Rexdale M9W 6H8 
Tel: (800) 628-8686 
FAX: (416) 675-2438 

QUEBEC 

Intel of Canada, Ltd. 

1 Rue Holiday. Tour West 
Suite 320 
R. Claire H9R 5N3 
Tel: (800) 628-8686 
FAX: 514-694-0064 




NORTH AMERICAN SERVICE OFFICES 

Computervision 

Intel Corporation's North American Preferred Service Provider 

Central Dispatch: 1-800-876-SERV (1-800-876-7378) 


ALABAMA 

Birmingham 

Huntsvilie 

ALASKA 

Anchorage 

ARIZONA 

Phoenix 

Tucson 

ARKANSAS 

North Little Rock 
CALiFORNIA 
Concord 
Los Angeles 
Ontario 
Orange 
Redwood City 
Sacramento 
San Oiego 
San Francisco 
Van Nuys 

COLORADO 

Colorado Springs 
Denver 

CONNECTICUT 
E. Windsor 
Middiebury 

FLORIDA 
FI. Lauderdale 
Jacksonville 
Miami 
Oriando 
Pensacola 
Tampa 


GEORGIA 
Atlanta 
HAWAII 
Honolulu 
ILLINOIS 
Chicago 
Wood Dale 

INDIANA 
Carmel 
Evansville 
FI. Wayne 
South Bend 

IOWA 

Cedar Rapids 
Davenport 
West Des Moines 

KANSAS 
Kansas City 
Wichita 

KENTUCKY 

Louisville 

Madisonville 

LOUISIANA 
Baton Rouge 
New Orleans 

MAINE 

Auburn 

MARYLAND 

Baltimore 

MASSACHUSETTS 
Bedfoird 
S. Easton 


MICHIGAN 

Detroit 

Flint 

Grand Rapids 

Lansing 

Troy 

MINNESOTA 

Minneapolis 

MISSOURI 
Springfield 
Street Louis 

MISSISSIPPI 

Jackson 

NEW HAMPSHIRE 
Manchester" 

MONTANA 

Butte 

NEBRASKA 

Omaha 

NEW JERSEY 
Cherry Hill 
Hamilton Township 
Westfield 

NEW MEXICO 
Albuquerque 

NEW YORK 
Albany 
Binghampton 
Buffalo 
Farmingdale 
New York City 
Rochester 
Dryden 


NORTH CAROLINA 

Ashville 

Charlotte 

Greensboro 

Raleigh 

Wilmington 

OHIO 

Cincinnati 

Cleveland 

Columbus 

Dayton 

OKLAHOMA 

Oklahoma City 
Tulsa 

OREGON 
Beaverton 
PENNSYLVANIA 
Camp Hill 
Erie 

Pittsburgh 

Wayne 

SOUTH CAROLINA 

Charleston 

Columbia 

Greenville 

TENNESSEE 

Chattanooga 

Knoxville 

Memphis 

Nashville 

TEXAS 

Austin 

Houston 

Dallas 

Tyler 


UTAH 

Salt Lake City 
VERMONT 
White River Junction 
VIRGINIA 
Charlottesville 
Richmond 
Roanoke 
Virginia Beach 

WASHINGTON 

Renton 

Richland 

WASHINGTON D.C.* 

WEST VIRGINIA 
Charleston 
WISCONSIN 
Milwaukee 
CANADA 
Calgary 
Edmonton 
Fredericton 
Halifax 
Mississauga 
Montreal 
Ottawa 
Toronto 

Vancouver, BC" 
Winnipeg 
Quebec City 
Regina 
St.John’s 
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UNITED STATES, Intel Corporation 

2200 Mission College Blvd., P.O. Box 58119, Santa Clara, CA 95052-8119 

Tel: +1 408 765-8080 

JAPAN, Intel Japan K.K. 

5-6 Tokodai, Tsukuba-shi, Ibaraki-ken 300-26 
Tel: + 81-29847-8522 

FRANCE, Intel Corporation S.A.R.L. 

1, Quai de Crenelle, 75015 Paris 
Tel: +33 1-45717171 

UNITED KINGDOM, Intel Corporation (U.K.) Ltd. 

Pipers Way, Swindon, Wiltshire, England SN31RJ 
Tel: +44 1-793-641440 

GERMANY, Intel GmbH 
Dornacher Strasse 1 
85622 Feidkirchen/ Muenchen 
Tel: +49 89/99143-0 

HONG KONG, Intel Semiconductor Ltd. 

32/F Two Pacific Place, 88 Queensway, Central 
Tel: +852 2844-4555 

CANADA, Intel Semiconductor of Canada, Ltd. 

190 Attweil Drive, Suite 500 
Rexdale, Ontario MOW 6H8 
Tel: +416 675-2438 
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